WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 
C12Q 1/68 



A2 



(11) International Publication Number: WO 98/53097 

(43) International Publication Date: 26 November 1998 (26.11.98)



(21) International Application Number: PCT/CA98/00488 

(22) International Filing Date: 21 May 1998 (21.05.98) 



(30) Priority Data:
08/861,774    22 May 1997 (22.05.97)    US



(71) Applicant: TERRAGEN DIVERSITY INC. [CA/CA]; University of
British Columbia, Suite 300, 2386 East Mall, Vancouver,
British Columbia V6T 1Z3 (CA).

(72) Inventors: WATERS, Barbara; 5706 Timbervalley Road, 

Delta, British Columbia V4L 2E6 (CA). MIAO, Vivian, P., 
W.; 13750 31 Avenue, Surrey, British Columbia V4P 2B7 
(CA). YAP, Wai, Ho; 5 Elite Terrace, Singapore 458748 
(SG). SEOW, Kah, Tong; 8 Jin Aneka, Serene Park, Johor 
Baru, Johor 80300 (MY). 

(74) Agent: DEETH WILLIAMS WALL; National Bank Building,
Suite 400, 150 York Street, Toronto, Ontario M5H 3S5
(CA).



(81) Designated States: AU, CA, JP, European patent (AT, BE, 
CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, 
NL, PT, SE). 



Published 

Without international search report and to be republished
upon receipt of that report.



(54) Title: METHOD FOR ISOLATION OF BIOSYNTHESIS GENES FOR BIOACTIVE MOLECULES 



(57) Abstract 



Degenerate primers which hybridize with various classes of antibiotic biosynthesis genes were used to amplify fragments of DNA
from soil and lichen extracts. Cloning and sequencing of the amplified products showed that these products included a variety of novel
and previously uncharacterized antibiotic biosynthesis gene sequences, the products of which have the potential to be active as antibiotics,
immunosuppressors, antitumor agents, etc. Thus, antibiotic biosynthesis genes can be recovered from soil or lichens by combining a sample
with a pair of amplification primers under conditions suitable for polymerase chain reaction amplification, wherein the primer set is a
degenerate primer set selected to hybridize with conserved regions of known antibiotic biosynthetic pathway genes, for example Type I and
Type II polyketide synthase genes, isopenicillin N synthase genes, and peptide synthetase genes; cycling the combined sample through a
plurality of amplification cycles to amplify DNA complementary to the primer set; and isolating the amplified DNA.




METHOD FOR ISOLATION OF BIOSYNTHESIS GENES 
FOR BIOACTIVE MOLECULES 

DESCRIPTION 

BACKGROUND OF THE INVENTION 
This application relates to a method for the isolation of biosynthesis genes for 
antibiotics and other bioactive molecules from complex natural sources such as humus, soil 
and lichens. 

Antibiotics play an important role in man's efforts to combat disease and other
economically detrimental effects of microorganisms. Traditionally, antibiotics have been
identified by screening microorganisms, especially those found naturally in soil, for their
ability to produce an antimicrobial substance. In some cases, the gene or genes responsible
for antibiotic synthesis have then been identified and cloned into producer organisms which
produce the antibiotic in an unregulated manner for commercial applications. However, it
has been estimated that less than 1% of the microorganisms present in soil are culturable.
Torsvik et al., Appl. Environ. Microbiol. 56: 782-787 (1990). Thus, much of the genetic
diversity potentially available in soil microorganisms is unavailable through traditional
techniques.

As pathogenic microorganisms become increasingly resistant to known
antibiotics, it would, however, be highly desirable to be able to access the reservoir of genetic
diversity found in soil, and to facilitate the exploration of new species of antibiotics which
may be made by the vast numbers of unculturable organisms found there. It would further be
desirable to have access to novel biosynthetic enzymes and the genes encoding such enzymes,
which could be used in recombinant organisms for antibiotic production or for in vitro
enzymatic synthesis of desirable compounds. Thus, it is an object of the present invention to
provide a method and compositions for isolating DNA and DNA fragments encoding
enzymes relevant to the production of pharmaceutically active molecules such as antibiotic
biosynthesis enzymes.

SUMMARY OF THE INVENTION
We have now identified degenerate primers which hybridize with various
classes of antibiotic biosynthesis genes, and have used such primers to amplify fragments of
DNA from soil and lichen extracts. Cloning and sequencing of the amplified products
showed that these products included a variety of novel and previously uncharacterized
antibiotic biosynthesis gene sequences, the products of which have the potential to be active
as antibiotics, immunosuppressors, antitumor agents, etc. Thus, antibiotic biosynthesis genes
can be recovered from soil by a method in accordance with the present invention comprising
the steps of:

(a) combining a soil-derived sample with a pair of amplification primers
under conditions suitable for polymerase chain reaction amplification, wherein the primer set
is a degenerate primer set selected to hybridize with conserved regions of known antibiotic
biosynthetic pathway genes, for example Type I and Type II polyketide synthase genes,
isopenicillin N synthase genes, and peptide synthetase genes;

(b) cycling the combined sample through a plurality of amplification
cycles to amplify DNA complementary to the primer set; and

(c) isolating the amplified DNA.

DETAILED DESCRIPTION OF THE INVENTION
In accordance with the present invention, antibiotic biosynthesis genes can be
recovered from soil and lichens by a method comprising the steps of:

(a) combining a humic or lichen-derived sample with a pair of
amplification primers under conditions suitable for polymerase chain reaction amplification,
wherein the primer set is a degenerate primer set selected to hybridize with conserved regions
of an antibiotic biosynthesis gene;

(b) cycling the combined sample through a plurality of amplification
cycles to amplify DNA complementary to the primer set; and

(c) isolating the amplified DNA.

As used in the specification and claims of this application, the term "humic or
lichen-derived sample" encompasses any sample containing the DNA found in lichens or in
samples of humic materials, including soil, mud, peat moss, marine sediments, and effluvia
from hot springs and thermal vents, in a form accessible for amplification and substantially
without alteration of the natural ratios of such DNA in the sample. One exemplary form of a
humic sample is a sample obtained by performing direct lysis as described by Barns et al., Proc.
Nat'l Acad. Sci. USA 91: 1609-1613 (1994) on a soil sample and then purifying the total DNA
extract by column chromatography. Related extraction methods can be applied to the
isolation of community DNA from other environmental sources. See Trevors et al., eds.,
Nucleic Acids in the Environment, Springer Lab Manual (1995). Lichen-derived samples
may be prepared from foliose lichens by the method of fungal DNA extraction described by
Miao et al., Mol. Gen. Genet. 226: 214-223 (1991). Specific non-limiting procedures for
isolation of DNA from humic and lichen samples are set forth in the examples herein.

The humic or lichen-derived sample is combined with at least one, and
optionally with several, pairs of amplification primers under conditions suitable for
polymerase chain reaction amplification. Polymerase chain reaction (PCR) amplification is a
well known process. The basic procedure, which is described in US Patents Nos. 4,683,202
and 4,683,195, which are incorporated herein by reference, makes use of two amplification
primers, each of which hybridizes to a different one of the two strands of a DNA duplex.
Multiple cycles of primer extension using a polymerase enzyme and denaturation are used to
produce additional copies of the DNA in the region between the two primers. In the present
invention, PCR amplification can be performed using any suitable polymerase enzyme,
including Taq polymerase and Thermo Sequenase™.
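
As a rough illustration of why repeated cycling produces many copies of the region between the primers, the following Python sketch computes the theoretical copy number after a given number of cycles. It is a simplified model, not part of the described method: the function name and the `efficiency` parameter are assumptions introduced here, and real reactions fall short of perfect doubling as reagents are consumed.

```python
def ideal_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Theoretical copy number after PCR cycling.

    efficiency = 1.0 models perfect doubling in every cycle; lower values
    model a fixed fraction of templates being copied per cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Upper bound for a single template molecule after 25 and 30 cycles.
    for n in (25, 30):
        print(n, "cycles:", f"{ideal_copies(1, n):.3g}", "copies (idealized)")
```

The value returned is an upper bound under the stated assumption of constant efficiency; it is meant only to convey the order of magnitude involved.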

The amplification primers employed in the method of the invention are degenerate
primer sets selected to hybridize with conserved regions of known antibiotic biosynthetic
genes, for example Type I and Type II polyketide synthase genes, isopenicillin N synthase
genes, and peptide synthetase genes. Each degenerate primer set of the invention includes
multiple primer species which hybridize with one DNA strand, and multiple primer species
which hybridize with the other DNA strand. All of the primer species within a degenerate
primer set which bind to the first strand are the same length, and hybridize with the same
target region of the DNA. These primers all have very similar sequences, but have a few
bases different in each species to account for the observed variations in the target region. For
this reason, they are called degenerate primers.
Similarly, all of the primers within a degenerate primer set which bind to the second strand
are the same length, hybridize with the same target region of the DNA, and have very similar
sequences with a few bases different in each species to account for the observed variations in
the target region.
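
To make the idea of "multiple primer species of the same length" concrete, the following Python sketch expands a primer written in the bracket notation used in this specification, where for example (C/G) means either C or G at that position, into its individual species. The function names are illustrative assumptions, and the input string is the forward isopenicillin N synthase primer (SEQ ID No. 7) as it is written in this specification.

```python
import itertools
import re


def parse_primer(notation: str) -> list[str]:
    """Return, for each position, the string of bases allowed there."""
    compact = notation.replace(" ", "")
    positions = []
    # Each match is either a bracketed group such as (C/G/T) or one base letter.
    for alternatives, single in re.findall(r"\(([^)]+)\)|([A-Z])", compact):
        positions.append(alternatives.replace("/", "") if alternatives else single)
    return positions


def expand(notation: str) -> list[str]:
    """List every primer species encoded by a degenerate primer."""
    return ["".join(bases) for bases in itertools.product(*parse_primer(notation))]


if __name__ == "__main__":
    seq_id_7 = "GG(C/G/T) TC(C/G) GG(C/G) TT(C/T) TTC TAC GC"
    species = expand(seq_id_7)
    print(len(species), "species, lengths:", {len(s) for s in species})
    print(species[0])
```

Every species produced this way has the same length and differs from the others only at the bracketed positions, which is the property the paragraph above describes.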

The degenerate primer sets of the invention are selected to hybridize to highly
conserved regions of known antibiotic biosynthesis genes in such a way that they flank a
region of several hundred (e.g. 300) or more base pairs such that amplification leads to the
selective reproduction of DNA spanning a substantial portion of the antibiotic biosynthesis
gene. Selection of primer sets can be made based upon published sequences for classes of
antibiotic biosynthesis genes.

For example, for amplification of Type I polyketide synthase genes, we have
designed primers based upon the conserved sequences of six beta-ketoacyl carrier protein
synthase domains of the erythromycin gene cluster. Donadio et al., Science 252: 675-679
(1991); Donadio and Staver, Gene 126: 147-151 (1993). These primers have the sequences

5'-GC(C/G) (A/G)T(G/C) GAC CCG CAG CGC GC-3' [SEQ ID No. 1]

and

5'-GAT (C/G)(G/A)C GTC CGC (G/A)TT (C/G)GT (C/G)CC-3' [SEQ ID No. 2].

The expected size of the PCR product is 1.2 kilobase pairs. Other degenerate primer sets for
Type I and Type II polyketide synthase genes could be determined from sequence
information available in Hutchinson and Fujii, Ann. Rev. Microbiol. 49: 201-238 (1995).

Type II polyketide synthase gene clusters are characterized by the presence of
chain length factor genes which are arranged at the 3'-end of the ketosynthase genes. Primers
were designed based on one conserved region near the 3'-end of the ketosynthase gene and
one at the middle portion of the chain length factor gene. The sequences of one suitable set
of amplification primers are:

5'-CT(C/G)AC(G/C)(G/T)(C/G)GG(C/G)CGIAC(C/G)GC(C/G)AC(C/G)CG-3' [SEQ ID No. 3]

and

5'-GTT(C/G)AC(C/G)GCGTAGAACCA(C/G)GCGAA-3' [SEQ ID No. 4].

The expected size of the PCR product was 0.5 kilobase pairs. An alternative set of
degenerate primers has the sequences

5'-TTCGG(C/G)GGITTCCAG(T/A)(C/G)IGC(C/G)ATG-3' [SEQ ID No. 5]

and

5'-TC(C/G)A(G/T)(C/G)AG(C/G)GC(C/G)AI(C/G)GA(C/G)TCGTAICC-3' [SEQ ID No. 6].

These primers were designed based upon consensus sequences for the regions flanking the
KSβ (chain length factor) genes. The consensus sequences are available from Hutchinson and
Fujii, supra.

Primers were designed for beta-lactam biosynthetic genes on the basis of the
conserved sequences of a number of isopenicillin N synthase genes as described in
Aharonowitz et al., Ann. Rev. Microbiol. 46: 461-495 (1992). These primers have the
sequences

5'-GG(C/G/T) TC(C/G) GG(C/G) TT(C/T) TTC TAC GC-3' [SEQ ID No. 7]

and

5'-CCT (C/G)GG TCT GG(A/T) A(C/G)A G(C/G)A CG-3' [SEQ ID No. 8].

The expected size of the PCR product is 570 base pairs. Other degenerate primer sets could
be determined from sequence information available in Jensen and Demain, "Beta-Lactams" in
Genetics and Biochemistry of Antibiotic Production (L.C. Vining and C. Stuttard, eds.), pp.
239-268, Butterworth-Heinemann, Newton, MA (1995).

For isolation of peptide synthetase genes, primers based on two of the
conserved core sequences within the functional domains of peptide synthetase genes as
described by Turgay and Marahiel, Peptide Res. 7: 238-241 (1994) were utilized. These
primers had the sequences

5'-ATCTACAC(G/C)TC(G/C)GGCAC(G/C)AC(G/C)GGCAAGCC(G/C)AAGGG-3' [SEQ ID No. 9]

and

5'-A(A/T)IGAG(T/G)(C/G)ICCICC(G/C)(A/G)(A/G)(G/C)I(A/C)GAAGAA-3' [SEQ ID No. 10].

The expected size of the PCR product is 1.2 kilobase pairs.

PCR amplification can also be used for isolating lichen-derived antibiotic
biosynthesis genes and gene fragments. For isolation of Type I polyketide synthase genes
from lichens, the primer set used was previously described by Keller et al. in Molec. Appl. to
Food Safety Involving Toxic Microorganisms, J.L. Richard, ed., pp. 263-277 (1995), and had
the following sequences:

5'-MGIGARGCIYTIGCIATGGAYCCICARCARMG-3' [SEQ ID No. 11]

and

5'-GGRTCNCCIARYTGIGTICCIGTICCRTGIGC-3' [SEQ ID No. 12].

The expected size of the PCR product is approximately 0.7 to 0.9 kilobases. Actual products
evaluated ranged in size from 637 to 809 nucleotides (not including the 61 nt due to the
primers).
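
The lichen primers are written with single-letter ambiguity codes rather than bracketed alternatives. The following Python sketch, offered as an illustration only, counts how many distinct primer species such a sequence encodes. It assumes the standard IUPAC nucleotide ambiguity codes and treats I (inosine) as a single fixed base that does not add degeneracy; the table and function names are introduced here and are not from the specification.

```python
# Standard IUPAC nucleotide ambiguity codes; "I" (inosine) is treated as a
# single fixed base for counting purposes, which is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "I": "I",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct primer species encoded by an IUPAC-coded primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # raises KeyError on an unknown symbol
    return count


def inosine_count(primer: str) -> int:
    """Number of inosine positions in the primer."""
    return primer.upper().count("I")


if __name__ == "__main__":
    primers = {
        "SEQ ID No. 11": "MGIGARGCIYTIGCIATGGAYCCICARCARMG",
        "SEQ ID No. 12": "GGRTCNCCIARYTGIGTICCIGTICCRTGIGC",
    }
    for name, seq in primers.items():
        print(name, "degeneracy:", degeneracy(seq), "inosines:", inosine_count(seq))
```

Using inosine at highly variable positions keeps the number of species in the mixture small, which appears to be why these primers combine inosine with two-fold ambiguity codes.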

Once the primers and the sample are cycled through sufficient thermal cycles
to selectively amplify antibiotic biosynthetic DNA in the sample (generally around 25 cycles
or more), the amplified DNA is isolated from the amplification mixture. Isolation can be
accomplished in a variety of ways. For example, the PCR products can be isolated by
electrophoresis on an agarose or polyacrylamide gel, visualized with a stain such as ethidium
bromide and then excised from the gel for cloning. Primers modified with an affinity binding
moiety such as biotin may also be used during the amplification step, in which case the
affinity binding moiety can be used to facilitate the recovery. Thus, in the case of
biotinylated primers, the amplified DNA can be recovered from the amplification mixture by
coupling the biotin to a streptavidin-coated solid support, for example Dynal
streptavidin-coated magnetic beads.

It will be appreciated that the DNA obtained as a result of this isolation will
not generally be of a single type because of the degeneracy of the primers and the complexity
of the initial sample. Thus, although these steps are sufficient to recover antibiotic
biosynthesis genes from soil or lichen, it is preferable to further separate and characterize the
individual species of amplified DNA.

This further separation and characterization can be accomplished by inserting
the amplified DNA into an expression vector and cloning in a suitable host. The specific
combination of vectors and hosts will be understood by persons skilled in the art, although
bacterial expression vectors and bacterial hosts are generally preferred. Individual clones
are then picked and the sequence of the cloned plasmid determined. While random selection
has been employed successfully, selection of antibiotic biosynthesis gene-containing clones
can be facilitated by screening using hybridization with DNA probes based on conserved
sequences or by overlay of bacterial clones with an antibiotic-sensitive test strain.

Once the sequence of the cloned DNA is determined, it can be screened
against existing libraries of nucleotide and protein sequences for confirmation as an antibiotic
biosynthetic gene or gene fragment. Amplified DNA so identified can be used in several
ways. First, the amplified DNA, or distinctive portions thereof, can be used as probes to
screen libraries constructed from humic-derived or lichen DNA to facilitate the identification
and isolation of full length antibiotic biosynthetic genes. Once isolated, these genes can be
expressed in readily cultivated surrogate hosts, such as a Streptomyces species for
soil-derived genes or an Aspergillus species for lichen-derived genes. General procedures for
such expression are known in the art, for example from Fujii et al., Molec. Gen. Genet.
253: 1-10 (1996) and Bedford et al., J. Bacteriol. 177: 4544-4548 (1995), which are
incorporated herein by reference.
Second, amplified DNA which is different from previously known DNA can be used to
generate hybrid antibiotic biosynthesis genes using the procedures described by McDaniel et
al., Nature 375: 549-554 (1995); Stachelhaus et al., Science 269: 69-72 (1995); and
Stachelhaus et al., Biochem. Pharmacol. 52: 177-186 (1996). In these procedures, the novel
DNA sequences isolated using the method of the invention are spliced into a known antibiotic
gene to provide an expressible sequence encoding a complete gene product.

Using the method of the invention, a number of unique nucleotide sequences
have been identified and characterized. The sequences and the biosynthetic
polypeptides/proteins which they encode, given by Seq. ID Nos. 13 to 80, are a
further aspect of the present invention.



EXAMPLE 1

Total DNA was extracted from soil samples by a direct lysis procedure as
described by Barns et al. (1994). The high molecular weight DNA (>20 kb) in the extract
was separated on a Sephadex G200 column (Pharmacia, Uppsala, Sweden) as described by
Tsai and Olson, Appl. Environ. Microbiol. 58: 2292-2295 (1992).
The DNA extract (10-50 ng template DNA) was added to an amplification
mixture (total volume 100 µl) containing 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 2 mM
MgCl2, 200 µM of each deoxynucleotide triphosphate, 25 pmol of each Type I polyketide
primer (Seq. ID Nos. 1 and 2) and 5.0 units of Taq polymerase (BRL Life Technologies,
Gaithersburg, MD). The mixture was then thermally cycled for 30 cycles in an MJ Research
PTC-100 thermocycler using the following program:

denaturation    93°C    60 seconds
annealing       60°C    30 seconds
extension       72°C    90 seconds
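
The cycling program above can be written down as data, which makes it easy to estimate how long a run holds at temperature. The following Python sketch is illustrative only: the `Step` class and function name are assumptions introduced here, and the estimate ignores ramp times between temperatures and any initial or final hold, none of which are stated in this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Program of Example 1 as stated in the text.
EXAMPLE_1_PROGRAM = [
    Step("denaturation", 93.0, 60),
    Step("annealing", 60.0, 30),
    Step("extension", 72.0, 90),
]


def total_hold_seconds(program: list[Step], cycles: int) -> int:
    """Total time spent holding at temperature, excluding ramping."""
    return cycles * sum(step.seconds for step in program)


if __name__ == "__main__":
    seconds = total_hold_seconds(EXAMPLE_1_PROGRAM, cycles=30)
    print(f"{seconds} s of hold time, i.e. {seconds / 60:.0f} minutes")
```

For 30 cycles of 180 seconds each, the hold time is 5400 seconds (90 minutes); the actual run would be longer by whatever the thermocycler needs for ramping.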

The PCR products were then electrophoresed in 1% agarose gels and stained
with ethidium bromide to visualize the DNA bands. Bands containing PCR product of the
expected size were excised from the gel and purified using a Qiaex Gel Extraction kit (Qiagen
GmbH). The purified DNA was ligated to pCRII (Invitrogen) to generate a clone library
using E. coli INVαF' competent cells. Eighteen clones were chosen at random from the library and
sequenced using a Taq Dye Terminator Cycle Sequencing Kit and an Applied Biosystems
DNA sequencer model 373. The sequencing primers used included the universal M13 (-20)
forward primer, the M13 reverse primer and primers designed from the sequence data
obtained. DNA sequences were translated into partial amino acid sequences using a software
package from Geneworks (Intelligenetics, Inc.) with further manual adjustments and sent to
the NCBI database by e-mail at blast@ncbi.nlm.nih.gov for comparison against protein
databases. Altschul et al., "Basic Local Alignment Search Tool", J. Mol. Biol. 215: 403-410 (1990).

BLAST analysis of the 18 clones pointed to 12 unique sequences that were not
identical to each other or to published sequences. Seq. ID No. 13 shows the complete DNA
sequence of a representative unique clone (Clone ksfs). Seq. ID No. 14 shows the translated
amino acid sequence of this clone. The greatest homology as determined by BLAST analysis
is indicated to be to Type I polyketide synthases. Similar results were obtained on the BLAST
search of the other 11 unique clones based upon partial sequences which were determined.

EXAMPLE 2 

The experiment of Example 1 was repeated using isopenicillin N synthase
gene primers (Seq. ID Nos. 7 and 8). The thermal cycling program was changed to include
60 second extension periods at 72°C, but otherwise the experimental conditions were the same.
Twelve clones were picked at random and yielded one unique sequence that was not identical
to published sequences. The complete sequence of this clone (Clone ipnsfs) is shown in Seq.
ID No. 15 and the translated amino acid sequence in Seq. ID No. 16. The BLAST search
indicated greatest homology for this sequence with isopenicillin N synthases.

EXAMPLE 3 

The experiment of Example 1 was repeated using peptide synthetase primers
(Seq. ID Nos. 9 and 10). The amplification mixture was changed to a 50 µl volume containing
10 to 50 ng of template DNA, 20 mM (NH4)2SO4, 74 mM Tris-HCl (pH 8.8), 1.5 mM MgCl2,
0.01% Tween 20, 200 µM of each deoxynucleotide triphosphate, 25 pmol of each primer,
0.25% skim milk and 0.4 units of Ultra Therm DNA Polymerase (Bio/Can Scientific,
Mississauga, Ontario). The mixture was thermocycled for 30 cycles using the following
program:

denaturation    95°C    60 seconds
annealing       52°C    60 seconds
extension       72°C    120 seconds
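
The examples state primer amounts in picomoles and reaction volumes in microlitres, so the working primer concentration differs between the 100 µl reaction of Example 1 and the 50 µl reaction used here. The short Python sketch below makes that conversion explicit; the function name is an assumption introduced for illustration, and the concentrations it prints are derived from the stated amounts rather than quoted from the specification.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in µl to a concentration in µM.

    1 pmol/µl equals 1 µmol/l, so the conversion is a plain division.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


if __name__ == "__main__":
    # 25 pmol of each primer in the two reaction volumes used in the examples.
    print("Example 1 (100 µl):", micromolar(25, 100), "µM each primer")
    print("Example 3 (50 µl): ", micromolar(25, 50), "µM each primer")
```

Halving the reaction volume while keeping 25 pmol of each primer therefore doubles the primer concentration, from 0.25 µM to 0.5 µM.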

Thirty clones containing a 1.2 kb insert have been partially sequenced. The
BLAST analysis of the 30 clones pointed to 28 unique sequences that were not identical to
each other or to published sequences. Varying degrees of homology to known peptide
synthetase genes were seen. Seq. ID No. 17 shows the complete DNA sequence of a
representative clone (ps32). Seq. ID No. 18 shows the translated amino acid sequence of this
clone. Based on a BLAST search of these sequences, the greatest homology is to peptide
synthetase genes such as the pristinamycin synthase gene from Streptomyces pristinaespiralis
and Bacillus sp. peptide synthetase genes such as gramicidin S synthase and surfactin
synthetase. Stachelhaus and Marahiel, FEMS Micro. Letters 125: 3-14 (1995); Turgay et al.,
Mol. Micro. 6: 529-546 (1992).

Sequence ID Nos. 81 to 94 show an additional 7 unique sequences (nucleic
acid and translated amino acid sequences) of 1.2 kb PCR products amplified from soil DNA
samples using these primers. These sequences have been named ps 2, ps 3, ps 7, ps 10, ps 24,
ps 25 and ps 30. The sequences are unique in that they are all different from each other and
from ps 32, and while they show greatest homology to peptide synthetase sequences in the
databases searched by BLAST analysis, they do not match any known sequence. Within each, the
conserved motifs (TGD, KIRGXRIEL, NGK) common to peptide synthetase domains as
described by Turgay and Marahiel (1994) can be identified. Descriptive information of the
clones follows:

Clone ps 2,  1204 bp, with conserved motifs SGD, KIRGFRIEL, NGK, 67% G + C
Clone ps 3,  1178 bp, with conserved motifs TGD, KIRGSRIEL, NGK, 59% G + C
Clone ps 7,  1222 bp, with conserved motifs TGD, KIRGYRIEL, NGK, 55.5% G + C
Clone ps 10, 1171 bp, with conserved motifs TGD, KIRGHRIEL, NLK, 63% G + C
Clone ps 24, 1190 bp, with conserved motifs TGD, KIRGHRIAM, NQK, 56% G + C
Clone ps 25, 1178 bp, with conserved motifs TGD, KLRGYRIEL, NDK, 68% G + C
Clone ps 30, 1200 bp, with conserved motifs TGD, KVRGFRIEP, NGK, 64.5% G + C
Clone ps 32, 1172 bp, with conserved motifs TGD, KIRGFRIEL, SGK, 67% G + C
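
The clone descriptions above show that several clones carry variants of the three consensus motifs rather than exact matches. The following Python sketch tabulates, for each clone, which motifs depart from the consensus (TGD, KIRGXRIEL with X standing for any residue, NGK). It is a reading aid built only from the motif strings listed above; the dictionary and function names are assumptions introduced here.

```python
import re

# Consensus motifs from the text; X in KIRGXRIEL is read as "any residue".
CONSENSUS = {
    "TGD": r"TGD",
    "KIRGXRIEL": r"KIRG.RIEL",
    "NGK": r"NGK",
}

# Motifs reported for each clone, in the same order as CONSENSUS.
CLONES = {
    "ps 2": ("SGD", "KIRGFRIEL", "NGK"),
    "ps 3": ("TGD", "KIRGSRIEL", "NGK"),
    "ps 7": ("TGD", "KIRGYRIEL", "NGK"),
    "ps 10": ("TGD", "KIRGHRIEL", "NLK"),
    "ps 24": ("TGD", "KIRGHRIAM", "NQK"),
    "ps 25": ("TGD", "KLRGYRIEL", "NDK"),
    "ps 30": ("TGD", "KVRGFRIEP", "NGK"),
    "ps 32": ("TGD", "KIRGFRIEL", "SGK"),
}


def deviations(motifs: tuple[str, str, str]) -> list[str]:
    """Names of the consensus motifs that a clone does not match exactly."""
    return [
        name
        for (name, pattern), motif in zip(CONSENSUS.items(), motifs)
        if not re.fullmatch(pattern, motif)
    ]


if __name__ == "__main__":
    for clone, motifs in CLONES.items():
        diffs = deviations(motifs)
        print(f"{clone:6s}", "matches consensus" if not diffs else f"differs in {diffs}")
```

Under this strict reading only ps 3 and ps 7 match all three consensus motifs exactly; the remaining clones differ by one or two residues in at least one motif, consistent with the statement that the sequences are related to, but distinct from, known peptide synthetases.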

EXAMPLE 4 

The experiment of Example 1 was repeated using the Type II polyketide
synthase primers given by Seq. ID Nos. 3 and 4. PCR amplification was carried out in a
total volume of 50 µl containing 50 ng of soil DNA, 20 mM Tris-HCl (pH 8.4), 50 mM KCl,
2 mM MgCl2, 200 µM of each deoxynucleotide triphosphate, 25 pmol of each primer and 5.0
units of Taq polymerase (BRL Life Technologies, Gaithersburg, MD). The thermal cycling
conditions included denaturations at 94°C for 60 seconds, annealing at 58°C for 30 seconds
and extensions at 72°C for seconds, repeated for a total of 30 cycles.

PCR amplification yielded products of the expected size of 0.5 kilobase pairs.
Sequencing of 18 randomly selected clones revealed the presence of 5 unique sequences that
were not identical to each other or to published sequences. Seq. ID No. 19 shows the
complete DNA sequence of a representative clone (clone clf). The translated amino acid
sequence of this clone is shown in Seq. ID No. 20. In a BLAST search of this DNA
sequence against the protein database, the greatest homology is indicated to chain length
factor genes of the Type II polyketide synthases.
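
The clone counts reported in Examples 1 to 4 can be compared side by side. The Python sketch below computes, for each primer set, the fraction of sequenced clones that represented unique sequences. The counts are taken from the text of the examples; the labels, names and the idea of summarizing them as a fraction are additions made here for illustration, and the samples are too small to support statistical conclusions.

```python
# (clones sequenced, unique sequences found) as reported in Examples 1 to 4.
RESULTS = {
    "Example 1, Type I polyketide synthase": (18, 12),
    "Example 2, isopenicillin N synthase": (12, 1),
    "Example 3, peptide synthetase": (30, 28),
    "Example 4, Type II polyketide synthase": (18, 5),
}


def unique_fraction(sequenced: int, unique: int) -> float:
    """Fraction of sequenced clones that yielded a unique sequence."""
    if sequenced <= 0 or not 0 <= unique <= sequenced:
        raise ValueError("counts must satisfy 0 <= unique <= sequenced, sequenced > 0")
    return unique / sequenced


if __name__ == "__main__":
    for label, (sequenced, unique) in RESULTS.items():
        print(f"{label}: {unique}/{sequenced} = {unique_fraction(sequenced, unique):.0%}")
```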



EXAMPLE 5

The experiment of Example 1 was repeated using the Type I polyketide
synthase primers designed for fungal sequences (Seq. ID Nos. 11 and 12). PCR
amplifications were carried out with lichen DNA samples from a variety of lichen species
representing 11 genera, prepared as described in Miao et al. (1991), supra.

PCR amplifications were carried out in a total volume of 50 µl containing
approximately 10 ng of lichen DNA and 1 unit of Taq polymerase in a reaction as per
Example 4. The cycling protocol was 30 cycles of denaturation at 95°C for 60 seconds,
annealing at 57°C for 2 minutes and extension at 72°C for 2 minutes.

Forty-seven clones with inserts of the expected size have been partially
sequenced. The sequences all show homology to Type I fungal polyketide synthase genes but
are all distinct from each other and from known sequences. Seq. ID No. 21 shows the
complete DNA sequence of a 637 base pair product amplified from DNA extracted from the
lichen Xanthoparmelia cumberlandia (clone Xa.cum.6A). The translated amino acid
sequence is shown in Seq. ID No. 22. The greatest homology as determined by BLAST analysis is
indicated to fungal Type I polyketide synthase genes. Seq. ID Nos. 29 and 30 show the
DNA sequence and conceptual amino acid sequence, respectively, for a further clone,
Xa.cum.6H, isolated in this experiment. Sequences of DNA and the corresponding amino
acid sequences are also given for seven other lichen samples: Leptogium corniculatum (Seq. ID Nos. 31-42);
Parmelia sulcata (Seq. ID Nos. 43-50); Peltigera neopolydactyla (Seq. ID Nos. 51-60);
Pseudocyphellaria anthraspis (Seq. ID Nos. 61-62); Siphula ceratites (Seq. ID Nos. 63-66);
Thamnolia vermicularis (Seq. ID Nos. 67-68); and Usnea florida (Seq. ID Nos. 69-80). Each
of these sequences showed homology by BLAST analysis to fungal Type I polyketide synthase.



EXAMPLE 6

The experiment of Example 5 was repeated on DNA from the lichen Solorina
crocea using the degenerate peptide synthetase primers of Example 3. Freshly collected
lichen (approximately 1.2 g) was washed in running tap water to remove conspicuous soil and
field detritus, and then further cleaned under a dissecting microscope. The cleaned sample
was then gently shaken in a 50 ml tube containing about 40 ml of 0.2% SDS for at least 30
minutes and rinsed thoroughly with water. Excess surface water was blotted from the
washed, hydrated lichen, and the sample was frozen at -80°C for at least 15 minutes then
vacuum dried at room temperature for 4 hours. The lichen was ground in liquid nitrogen
using a mortar and pestle to produce a lichen powder for use in preparing DNA extracts.

To prepare the DNA extracts, 0.28 g of lichen powder was placed into 18 2-ml
microfuge tubes, and each aliquot was mixed with 1.25 ml isolation buffer (150 mM EDTA,
50 mM Tris pH 8, 1% sodium lauroyl sarcosine) and extracted for 1 hour at 62°C. The
samples were centrifuged for three minutes to pellet cellular debris and a cloudy supernatant
was decanted into new microfuge tubes. Each sample of the supernatant was mixed with 750
µl of 7.5 M ammonium acetate, incubated on ice for 30 minutes and centrifuged for five minutes
at 16,000 x g to precipitate proteins. The supernatant fluid was saved in new microfuge
tubes and nucleic acids were precipitated with 0.6 volumes of isopropanol overnight at 4°C.
Samples were centrifuged for five minutes at 16,000 x g to pellet nucleic acids. The pellets
were dissolved in TE containing RNAse (18 ng total) at 50°C for 45 minutes. The solutions
were then extracted with an equal volume of TE-saturated phenol:chloroform (1:1), and again
with chloroform. DNA in the aqueous phase was precipitated with 0.1 M sodium acetate and
two volumes of ethanol at -20°C for 2 hours, and then pelleted by centrifugation for five
minutes at 16,000 x g. The DNA pellet was washed with 75% ethanol, vacuum dried at
room temperature for 3 minutes and then dissolved in TE. The final amount of DNA
recovered was approximately 70 µg according to fluorometric measurement.

Two clones containing the expected 1.2 kb insert were sequenced and found to
contain the same sequence, shown in Seq. ID No. 23. Seq. ID No. 24 shows the translated
amino acid sequence. The sequence is distinct, with greatest homology as determined by
BLAST analysis to the peptide synthetase module of the cyanobacterium Microcystis aeruginosa.

EXAMPLE 7

The experiment of Example 4 was repeated using the Type II polyketide
synthase primers given by Seq. ID. Nos. 5 and 6. Three starting samples were used for
recovery of Type II polyketide synthase genes: two uncharacterized strains of Streptomyces
(strains WEC 68A and WEC 71B) which had been shown to contain Type II polyketide
synthase genes, and a soil sample obtained from a forest area near Vancouver, British
Columbia. The soil sample was prepared using the basic protocol from Holben et al., Appl.
Environ. Microbiol. 54: 703-711 (1988), with variations in parameters such as mix time to
adjust for the individual characteristics of the soil samples.
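
Because the primers of Seq. ID. Nos. 5 and 6 are degenerate, each one represents a pool of distinct oligonucleotides. The Python sketch below is an illustration rather than part of the disclosed method: the helper names `degeneracy` and `expand` are hypothetical, and the primer strings are copied from the sequence listing. It counts the pool size implied by the IUPAC ambiguity codes. Inosine (I), which appears in some of the other primers, is deliberately not handled.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _clean(primer):
    """Remove the spacing used in the sequence listing and upper-case."""
    return "".join(primer.split()).upper()


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in _clean(primer):
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence in the pool (only for small pools)."""
    choices = (IUPAC[base] for base in _clean(primer))
    return ["".join(combo) for combo in product(*choices)]


SEQ_ID_5 = "TTCGGSGGNT TCCAGWSNGC SATG"
SEQ_ID_6 = "TCSAKSAGSG CSANSGASTC GTANCC"

print(degeneracy(SEQ_ID_5), degeneracy(SEQ_ID_6))
```

A larger pool means each individual sequence is present at a lower effective concentration in the reaction, which is one reason a degeneracy count can be useful when comparing primer designs.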

Streptomyces genomic DNA preparations suitable for PCR amplification were
prepared from the mycelia harvested from a 50 ml culture in tryptic soy broth (Difco) which
had been grown for 3 days at 30°C. The mycelia were collected by centrifugation at 2500 x
g for 10 minutes, the pellets were washed in 10% v/v glycerol and the washed pellets were
frozen at -20°C. The size of the pellets will vary with different strains; for extraction, 1 g
samples were suspended in 5 ml TE buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA) in a 50
ml screw cap Oakridge tube and lysozyme (to 10 mg/ml) and RNase (to 40 µg/g) were added.
Following incubation at 30°C for 45 min., a drop of each suspension was transferred to a
microscope slide, one drop of 10% SDS was added and the suspension was checked for
complete clearing and increased viscosity, indicating lysis. Most strains lyse with this
incubation time, but incubation in lysozyme may be continued if necessary. (For strains
which are very resistant to lysis, small amounts of DNA suitable for PCR amplification may
often be prepared on a FastPrep™ instrument as described below.) Following confirmation
of sufficient incubation time in lysozyme, 1.2 ml of 0.5 M EDTA, pH 8.0 was added to the
suspension and mixed gently, then 0.13 ml of 10 mg/ml Proteinase K (Gibco/BRL) solution
was added and the suspension was incubated for 5 min. at 30°C. Then 0.7 ml of 10% SDS
was added, mixed gently by tilting, and the suspension was incubated again at 30°C for 2
hours. Following lysis, three successive phenol/chloroform extractions were performed, each
time adding a volume, equal to that of the aqueous phase, of a 1:1 mixture of ultrapure Tris
buffer saturated phenol (Gibco/BRL) and chloroform. After centrifugation at 2500 x g for
10 min., the aqueous phase was recovered each time with a shortened (i.e. wide bore) Pasteur
pipet to minimize shearing. DNA was precipitated from the final aqueous phase with the
addition of 0.1 volume of 3 M Na acetate, pH 4.8 and 1 volume of isopropanol at room
temperature. DNA was spooled from the solution onto a sealed Pasteur pipet, rinsed in ice
cold 70% ethanol and solubilized in 0.5 ml TE buffer overnight at room temperature. DNA
yields (as determined spectrophotometrically) typically range from 1 to 3 mg from 1 g of
mycelia.

An alternative method for the preparation of small amounts of Streptomyces
DNA suitable for PCR amplification has been found to be useful for strains resistant to lysis
or when a faster method is desirable. This method makes use of the FastPrep™ instrument
(Savant) and the methods and kit supplied by BIO 101 (Bio/Can Scientific, Mississauga,
Canada). A 2 ml aliquot from a 20 ml, 3 day culture in tryptic soy broth is pelleted in a 2 ml
microfuge tube and the size of the mycelial pellet is estimated. "Small" pellets are
resuspended in 100 µl of sterile distilled water; larger pellets are resuspended in 200-300 µl of
water. 200 µl of suspension is transferred to a homogenization tube from the kit. Following
the manufacturer's protocol for the preparation of DNA from medium hard tissue, the large
bead is added to this tube (which already contains a small bead), 1 ml of solution CLS-TC
from the kit is added and the samples are processed in the instrument for 10 seconds at speed
setting 4.5. Samples are then spun for 15 min. at 10,000 x g at 4°C and 600 µl of the supernatant
is transferred to a clean microfuge tube, 400 µl of Binding Matrix is added and mixed gently,
then the sample is spun for 1 min. as above. The supernatant is discarded and the pellet is
resuspended in 500 µl SEWS-M and transferred to a SPIN™ Filter unit. This is spun for 1
minute, the contents of the catch tube are discarded and the unit is spun again to dry. The
filter unit is transferred to a new microfuge tube and DNA is eluted from the matrix in 100 µl
DES which is left on the filter for 2-3 min. at room temperature. Eluted DNA is collected by
spinning once again and this DNA is now ready to use in PCR amplifications. Due to
components of the final solution, DNA prepared by this method is difficult to quantify.

Typically 1 µl or 1/10 µl of this eluate is suitable as a template for PCR;
larger quantities may be inhibitory to the PCR polymerase.

PCR amplification was carried out in a total volume of 50 µl containing 50 ng
of DNA, 5% DMSO, 1.25 mM MgCl2, 200 µM of each deoxynucleotide triphosphate, 0.5 µg
of each primer and 5.0 units of Taq polymerase (BRL Life Technologies, Gaithersburg, MD).
The thermal cycling started with a 'touch-down' sequence, lowering the annealing temperature
from 65°C to 58°C over the course of 8 cycles. The temperature of the annealing step
was then maintained at 58°C for a further 35 cycles. The overall cycle used was: denaturation
at 94°C for 45 seconds, annealing at 65°C to 58°C for 1 minute and extension at 72°C
for 2 minutes. The size of the amplified fragments was expected to be approximately 1.5 kb.
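
The touch-down program described above can be written out cycle by cycle. The Python sketch below is a minimal illustration under stated assumptions: the function name `touchdown_schedule` is hypothetical, and the annealing temperature is assumed to fall in equal steps (1°C per cycle, since 65°C to 58°C spans eight values). The text gives only the end points and the cycle counts, so the step size is an assumption, and ramp times between steps are ignored.

```python
def touchdown_schedule(start_c=65.0, end_c=58.0,
                       touchdown_cycles=8, hold_cycles=35):
    """Return one dict per cycle; each step is (temperature in C, seconds).

    Assumption: the annealing temperature decreases linearly from
    start_c to end_c across the touch-down cycles.
    """
    step = (start_c - end_c) / (touchdown_cycles - 1)
    anneal_temps = [start_c - i * step for i in range(touchdown_cycles)]
    anneal_temps += [end_c] * hold_cycles
    return [
        {
            "cycle": number,
            "denature": (94.0, 45),   # 94 C for 45 seconds
            "anneal": (temp, 60),     # 1 minute at the cycle's temperature
            "extend": (72.0, 120),    # 72 C for 2 minutes
        }
        for number, temp in enumerate(anneal_temps, start=1)
    ]


schedule = touchdown_schedule()
hold_seconds = sum(
    seconds
    for cycle in schedule
    for _, seconds in (cycle["denature"], cycle["anneal"], cycle["extend"])
)
for cycle in schedule[:9]:
    print(cycle["cycle"], cycle["anneal"][0])
print("total cycles:", len(schedule), "hold time (s):", hold_seconds)
```

Starting above the final annealing temperature favours the best-matched primer-template pairs in the early cycles, which matters when the primers are degenerate and the template is a mixed environmental sample.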

Amplification of the two Streptomyces strains produced DNA fragments of the
expected size (1482 bp and 1538 bp). Open reading frame analysis of the two sequences
revealed a set of three ORFs in each, corresponding to the 3'-ends of the putative
KSα-subunit genes (50 to 60 bp), possible full-length KSβ genes (approx. 1.2 kb) and the first
halves of potential ACP genes (approx. 100 bp). In each sequence, the first and second ORFs
were linked by a stop codon overlap typical of KSαβ gene pair junctions and a possible
indication of tight coexpression through translational coupling. The two KSβ genes were
separated from the downstream ACP genes by a short spacer, again consistent with the
expected gene organization.

From the clones created using the soil DNA as a source, two that were found
to contain 1.5 kb inserts were selected. These inserts were sequenced and found
to exhibit similarity to known KSβ genes, with three ORFs as described above. The translated
amino acid sequences of the four genes are shown in Sequence ID Nos. 25 to 28.

The four putative KSβ genes had a G+C content over 70%, which is typical for
the coding regions of Actinomycete genes. Results of database searches established that the
deduced products of all four ORFs were similar to known KSβ gene products from Type II
polyketide synthases but did not match any known sequences.
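
The G+C figure quoted above is a simple base count. As a hedged illustration, the Python sketch below computes it for any nucleotide string. The function name `gc_content` and the decision to ignore ambiguity codes are assumptions, and the example input is an arbitrary short string rather than one of the four gene sequences.

```python
def gc_content(dna):
    """Fraction of G and C among the unambiguous bases of a sequence.

    Assumption: ambiguity codes (N, R, S, ...) and whitespace are
    skipped rather than counted.
    """
    seq = "".join(dna.split()).upper()
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Arbitrary illustrative input: 8 of 10 bases are G or C.
example = "GCGGC CGATC"
print(f"{gc_content(example):.0%}")
```

Applied to a full coding region, a value above 0.70 would be consistent with the Actinomycete origin suggested in the text, although G+C content alone does not identify the source organism.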



EXAMPLE 8 

DNA can be extracted from large volumes of soil in accordance with the
following procedure. Place dry soil into a sterile blender with 0.2% sodium pyrophosphate
(100 ml/100 grams of soil). The pH of the sodium pyrophosphate solution should be about
10, although some variation to account for the characteristics of the soil may be appropriate.
The mixture is blended for 30 seconds, decanted into centrifuge bottles and then centrifuged
for 15 minutes at 100 X g at 4°C. The supernatant is decanted, filtered two times through
cheesecloth and saved. The pelleted soil is extracted an additional two times using the same
procedure.

After the extractions, the pooled supernatants are centrifuged for 15 minutes at
10,500 X g and the pellets are collected. The pellet may be incubated for 6 hours at 55°C in
pre-germination medium (0.5% w/v yeast extract (Difco), 0.5% w/v casamino acids (Difco)
with 0.005 M CaCl2 and 0.025 M TES, pH 8.0 (added separately from sterile stock after
autoclaving other components)) and then repelleted, or it may be used directly. In either case,
the pellet (approximately 30-200 mg) is mixed with 5 ml 1X TE (pH 8.0), 500 µl 0.5 M
EDTA (pH 8.6) and 500 µl of 20 mg/ml lysozyme in 1X TE (pH 8.0) and incubated for 30
minutes at 37°C. 500 µl of 20% SDS and 100 µl of 1% proteinase K in TE and 1% SDS are
then added and the mixture is vortexed gently before incubating for 60 minutes at 55°C or
overnight at 37°C.

The incubated mixture is combined with 10 ml 20% polyvinylpyrrolidone
(avg. MW=40,000) and incubated for 10 minutes at 70°C. One-half volume of 7.5 M
ammonium acetate (stored at -20°C) is then added, the resulting mixture is placed for 10
minutes on a low speed shaker, and then centrifuged for 20 minutes at 18,500 X g. The
supernatant is combined with 1 volume of isopropanol and incubated for 30 minutes at -20°C
before centrifuging for 20 minutes at 18,500 X g. The pellet from this centrifugation is
washed in 70% ethanol, and centrifuged for 10 minutes at 18,500 X g. The pellet from this
final centrifugation is collected and air dried.

EXAMPLE 9

To extract DNA from small amounts of soil the following procedure can be
used. Combine soil (approx. 1 g) with 1 ml distilled water, vortex to suspend and pellet at
19,000 X g for 5 minutes. After removing the supernatant, freeze/thaw the samples twice by
either of the following techniques: (a) -20°C freezer, 30 minutes, followed by 50-60°C water
bath (2 minutes), repeated 2 times; or (b) quick freeze in EtOH-dry ice bath (dip in until
frozen, approx. one minute) followed by 60°C water bath (2 minutes), repeated 2 times. The
pellets are then suspended in 350 µl TE buffer (pH 8.0), 50 µl 0.5 M EDTA and 50 µl of 20
mg/ml lysozyme in TE buffer, vortexed and incubated at 37°C for 30 minutes in a water bath.
50 µl of 20% SDS and 10 µl of 1% Proteinase K/1% SDS in TE buffer are added, and the
mixture is vortexed and incubated for one hour at 55°C or overnight at 37°C. One-tenth
volume of 20% polyvinylpyrrolidone (avg. MW=40,000) is then added and the mixture is
incubated at 70°C for 10 minutes.
One-half volume of 7.5 M ammonium acetate (stored at -20°C) is added, the tubes are shaken
at low speed for ten minutes and then centrifuged at 19,000 X g for 20 minutes. The
supernatant is collected using pipets with cut tips to avoid shearing DNA, combined with one
volume of isopropanol, mixed gently, and stored at -20°C for 30 minutes or 4°C overnight.
The DNA is then collected as a pellet by centrifugation at 19,000 X g for 10 minutes. The
resulting pellet is washed with 0.5 ml of 70% ethanol (stored at -20°C) and then air or
vacuum dried. The dried DNA is then dissolved in 50-150 µl of TE buffer, incubated at 4°C
for one hour and then heated to 60°C for 10 minutes to facilitate dissolving DNA. The
resulting solutions are stored at -20°C until use.
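
Several additions in the small-scale procedure are specified as fractions of the current volume, so the absolute amounts depend on what has been added before. The Python sketch below works through that arithmetic as an illustration only. The function name `example9_volumes` is hypothetical, the wet soil pellet is assumed to contribute no liquid volume, and the whole mixture is assumed to be recovered as supernatant, so the isopropanol figure is an upper bound.

```python
def example9_volumes(te_ul=350.0, edta_ul=50.0, lysozyme_ul=50.0,
                     sds_ul=50.0, proteinase_k_ul=10.0,
                     ammonium_acetate_stock_m=7.5):
    """Running liquid volumes (in microlitres) for the small-scale
    soil extraction.

    Assumptions: the soil pellet adds no volume, and each fractional
    addition refers to the liquid volume present at that step.
    """
    lysis_mix = te_ul + edta_ul + lysozyme_ul
    digest = lysis_mix + sds_ul + proteinase_k_ul
    pvp = digest / 10.0                      # one-tenth volume of 20% PVP
    before_salt = digest + pvp
    ammonium_acetate = before_salt / 2.0     # one-half volume of 7.5 M stock
    after_salt = before_salt + ammonium_acetate
    final_molar = ammonium_acetate_stock_m * ammonium_acetate / after_salt
    return {
        "lysis_mix_ul": lysis_mix,
        "digest_ul": digest,
        "pvp_ul": pvp,
        "ammonium_acetate_ul": ammonium_acetate,
        "after_salt_ul": after_salt,
        "ammonium_acetate_final_m": final_molar,
        "isopropanol_max_ul": after_salt,    # one volume of supernatant
    }


for name, value in example9_volumes().items():
    print(f"{name}: {value:.1f}")
```

Adding one-half volume of a 7.5 M stock always gives a final concentration of one third of the stock, 2.5 M, whatever the starting volume. Under the assumptions above the tube holds at most about 1.7 ml after the isopropanol is added.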

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Terragen Diversity Inc. 

(ii) TITLE OF INVENTION: METHOD FOR ISOLATION OF BIOSYNTHESIS 
GENES FOR BIOACTIVE MOLECULES 

(iii) NUMBER OF SEQUENCES: 94 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE : Deeth Williams Wall 

(B) STREET: National Bank Building, 150 York Street, Suite 400 

(C) CITY: Toronto 

(D) STATE: Ontario 

(E) COUNTRY: Canada 

(F) ZIP : M5H 3S5 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.5 inch, 1.44 Mb 

(B) COMPUTER: Dell (IBM Compatible) 

(C) OPERATING SYSTEM: Windows 95 

(D) SOFTWARE: Word 97 

(vi) CURRENT APPLICATION DATA : 

(A) APPLICATION NUMBER: Not yet assigned 

(B) FILING DATE: May 21, 1998 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/861,774 

(B) FILING DATE: May 22, 1997 

(viii) ATTORNEY /AGENT INFORMATION : 

(A) NAME: Eileen McMahon 

(B) REGISTRATION NUMBER: 

(C) REFERENCE/DOCKET NUMBER: 1694/0005 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 416-941-9440 

(B) TELEFAX: 416-941-9443 
(C) TELEX:



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GCSRTSGACC CGCAGCGCGC 20 

(2) INFORMATION FOR SEQ ID NO: 2: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL : no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 
GATSRCGTCC GCRTTSGTSC C 21 

(2) INFORMATION FOR SEQ ID NO : 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: yes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 
CTSACSKSGG SCGNACSGCS ACSCG 25 

(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4 
GTTSACSGCG TAGAACCASG CGAA 25 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: yes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 
TTCGGSGGNT TCCAGWSNGC SATG 24 

(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 
(iii) HYPOTHETICAL: no

(iv) ANTI- SENSE: no 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 6: 
TCSAKSAGSG CSANSGASTC GTANCC 26

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: yes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GGBTCSGGST TYTTCTACGC 20

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
CCTSGGTCTG GWASAGSACG 20 

(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: yes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ATCTACACST CSGGCACSAC SGGCAAGCCS AAGGG 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 26 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI -SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AWNGAGKSNC CICCSRRSNM GAAGAA 26

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: yes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
MGIGARGCIY TIGCIATGGA YCCICARCAR MG 32 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GGRTCNCCIA RYTGIGTICC IGTICCRTGI GC 32 

(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1206 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

GCGGTGGACC CGCAGCAGCG CCTCATGCTG GAGCTGGCCT GGTCCGCGCT 50 

GGAAAGCGCA GGTCATCCGC CCTCGATATT CCCCGGCCTG ATCGGGGTCT 100 

ATGTCGGCAT GAACTGGAAT CGCTATCGCG CGAATTGCAT TTCTGCACAC 150 

CCTGATGTGG TGGAGCGATT CGGTGAATTG AACACAGCGC TCGCCAACGA 200

ATACGACTTT CTTGCTACCC GAATCTCCTA CAAGCTCAAT CTGCGCGGTC 250

CCAGCGTCAC TATCAGCACC GCTTGTTCGA CTTCCCTGGT TGCCATTGCT 300

CAGGCTTCGC AGGCGTTGCT CAACTATGAA TGCGACATTG CTTTGGCTGG 350

GGTTGCCTCC ATAACCGTGC CTGTCAATGC AGGCTACCTC TACCAAGAAA 400

GGTGGCATGC TTTCACCGAA GGGGATTGTC CTACATTCGA TGCCCCAGCA 450 

CGGGACCACT TCAATGATGC CCCCTGTCTC CTTTTTGCGG GCCTGGAAAA 500 

CCCATCCAGG AGGGGGGGGG GGGCCCTCAT ACCCGGCCTT TCAAGCGGGA 550 

ACCTCTCACA GGAAGCGGAT GTTTCAGCCG AAGGGATGTT GAACATTGAC 600 

GCCGGCAGCA CGGGGGACAA GTTCAGGGAT GGGCGCGCTT TTGTTGTATG 650 

GGGGGGGCCT GGAAGAAGCA TTCAAGGGAC GGTGATCAAA CTTAACCCCT 700 

TCATTGGCGG GTTTGCCGCG GAACAAGGAC GGGTTCGGAC AAGGCGAGTT 750 

TACCGGCGCC CAGGCGTCAA TGGTCAGGGC GGAGTTCATT TCGCTTTGGC 800 
GGTGGAGTTT GCGGGATATT CGAATCCCGC AAGCATCGGG ATTTCATTCG 850

AAAACCCACG GGCACGGGCG ACGCCATTGG GCGATCCGAT AGAAGTGGCC 900 

GCGCTAAAGA TGGTTTTTCG CCGACGCTCG TTCCAGAGGC GCCGTTGCGC 950 

CCTTGGATCG GTCAAGAGTT GTGTCGGACA CCTGGTTCAC GCCGCCGGCG 1000 

TGACCGGATT TATCAAGGCT GTCTTGTCGG TCTACCACGG CAAGATCGCA 1050 

CCGACACTGT TTTTCGAGAA AGCAAATCCG AGGCTCGGGC TGGAAGACAG 1100 

TCCTTTCTAT GTCAATGCCG GACTCGAGAA GTGGACGGCC GCCGAGCAGC 1150 

CACGCCGCGC GGGGGTCAGT GCTTTCGGGG TCGGTGGCAC CAATGCGCAC 1200

GCGATC 1206

(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 402

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

Ala Val Asp Pro Gln Gln Arg Leu Met Leu Glu Leu Ala Trp Ser

5 10 15

Ala Leu Glu Ser Ala Gly His Pro Pro Ser Ile Phe Pro Gly Leu

20 25 30

Ile Gly Val Tyr Val Gly Met Asn Trp Asn Arg Tyr Arg Ala Asn

35 40 45

Cys Ile Ser Ala His Pro Asp Val Val Glu Arg Phe Gly Glu Leu

50 55 60

Asn Thr Ala Leu Ala Asn Glu Tyr Asp Phe Leu Ala Thr Arg Ile

65 70 75

Ser Tyr Lys Leu Asn Leu Arg Gly Pro Ser Val Thr Ile Ser Thr

80 85 90

Ala Cys Ser Thr Ser Leu Val Ala Ile Ala Gln Ala Ser Gln Ala

95 100 105

Leu Leu Asn Tyr Glu Cys Asp Ile Ala Leu Ala Gly Val Ala Ser

110 115 120

Ile Thr Val Pro Val Asn Ala Gly Tyr Leu Tyr Gln Glu Arg Trp

125 130 135

His Ala Phe Thr Glu Gly His Cys Pro Thr Phe Asp Ala Pro Ala

140 145 150

Arg Asp His Phe Asn Asp Ala Pro Cys Leu Leu Phe Ala Gly Leu

155 160 165

Glu Asn Pro Ser Arg Arg Gly Gly Gly Ala Leu Ile Pro Gly Leu

170 175 180

Ser Ser Gly Asn Leu Ser Gln Glu Ala Asp Val Ser Ala Glu Gly

185 190 195

Met Leu Asn Ile Asp Ala Gly Ser Thr Gly Asp Lys Phe Arg Asp

200 205 210

Gly Arg Ala Phe Val Val Trp Gly Gly Pro Gly Arg Ser Ile Gln

215 220 225

Gly Thr Val Ile Lys Leu Asn Pro Phe Ile Gly Gly Phe Ala Ala

230 235 240

Glu Gln Gly Arg Val Arg Thr Arg Arg Val Tyr Arg Arg Pro Gly

245 250 255

Val Asn Gly Gln Gly Gly Val His Phe Ala Leu Ala Val Glu Phe

260 265 270

Ala Gly Tyr Ser Asn Pro Ala Ser Ile Gly Ile Ser Phe Glu Asn

275 280 285

Pro Arg Ala Arg Ala Thr Pro Leu Gly Asp Pro Ile Glu Val Ala

290 295 300

Ala Leu Lys Met Val Phe Arg Arg Arg Ser Phe Gln Arg Arg Arg

305 310 315

Cys Ala Leu Gly Ser Val Lys Ser Cys Val Gly His Leu Val His

320 325 330

Ala Ala Gly Val Thr Gly Phe Ile Lys Ala Val Leu Ser Val Tyr

335 340 345

His Gly Lys Ile Ala Pro Thr Leu Phe Phe Glu Lys Ala Asn Pro

350 355 360

Arg Leu Gly Leu Glu Asp Ser Pro Phe Tyr Val Asn Ala Gly Leu

365 370 375

Glu Lys Trp Thr Ala Ala Glu Gln Pro Arg Arg Ala Gly Val Ser

380 385 390

Ala Phe Gly Val Gly Gly Thr Asn Ala His Ala Ile

395 400

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 565

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GGCTCCGGGT TTTTCTACGC GTCCAACCAC GGGATCGACG TCACGCGGGT 50 

GCGCGACGAG GTGAACAAGT TCCACGCCGA GATGACGCCC GGGGAGAAGT 100 

TCGAGCTGGC CATCAACGCC TACAACGACG CGAATCCGCA TACCCGCAAC 150 

GGGTATTACA TGGCCGTCGA AGGCAAGAAG GCCGTCGAGT CCTTCTGCTA 200

CCTCAACCCG GCCTTCACCC CCGAGCACCC GATGATCGAG GCGGGCGCGG 250 

CGGGGCACGA GGTGAACAAC TGGCCGGACG AGGCTCGCCA CCCCGGCTTC 300

CGTGAGTACG GGGGAGCAGT ACTTCGAAGA GGATCCTCCG ACCTGTCACT 350 

GGTGCTGCTG CGTGGGTACG CGCTGGCCCT GGGCAAGGAC GAGAACTACT 400

TCGACGACTA CGTCAAGCAC TCCGACACGC TCTCGGCCGT CTCGCTGATC 450 

CGTTACCCGT ACCTGGAGAA CTACCCGCCG GTGAAGACCG GTCCGGACGG 500

CGAGAAGCTC AGCTTCGAGG ATCACTTCGA CGTCTCGCTG ATCACCGTGC 550 

TCTTCCAGAC CCAGG 565 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 188 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Gly Ser Gly Phe Phe Tyr Ala Ser Asn His Gly Ile Asp Val Thr

5 10 15

Arg Val Arg Asp Glu Val Asn Lys Phe His Ala Glu Met Thr Pro

20 25 30

Gly Glu Lys Phe Glu Leu Ala Ile Asn Ala Tyr Asn Asp Ala Asn

35 40 45

Pro His Thr Arg Asn Gly Tyr Tyr Met Ala Val Glu Gly Lys Lys

50 55 60

Ala Val Glu Ser Phe Cys Tyr Leu Asn Pro Ala Phe Thr Pro Glu

65 70 75

His Pro Met Ile Glu Ala Gly Ala Ala Gly His Glu Val Asn Asn

80 85 90

Trp Pro Asp Glu Ala Arg His Pro Gly Phe Arg Glu Tyr Gly Gly

95 100 105

Ala Val Leu Arg Arg Gly Ser Ser Asp Leu Ser Leu Val Leu Leu

110 115 120

Arg Gly Tyr Ala Leu Ala Leu Gly Lys Asp Glu Asn Tyr Phe Asp

125 130 135

Asp Tyr Val Lys His Ser Asp Thr Leu Ser Ala Val Ser Leu Ile

140 145 150

Arg Tyr Pro Tyr Leu Glu Asn Tyr Pro Pro Val Lys Thr Gly Pro

155 160 165

Asp Gly Glu Lys Leu Ser Phe Glu Asp His Phe Asp Val Ser Leu

170 175 180

Ile Thr Val Leu Phe Gln Thr Gln

185

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1172 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

AAGGAGGGGC CGCCCGGGGC GAAGAAGCTG TCCGTCCGAC TGACACGTTC 50 

CACTCCGAGG AGCCCGGACC AGATGCGCGC CAGCTTTACC TCGACCGGCG 100 

TAGATGGCGG GTCGTAGTCA GTGCGATCCG ATGAGTCATC TGGAGGTGCA 150 

GGCAGCACCT TCAGATCGAT CTTGCCGCTC GCCATGCGCG GCATCTCGCG 200

GAGCTCGACG AATGCAGCCG GAATCATGTA CTCGGGCAAC CGCGTGCGAA 250

GATGATCGCG CAGCTCGGAC GCGGCGACCG AGGCGAGCCG AGGCGACCAG 300

TACGCAACGA GACGCTTGTC GCCGGCCCGC TCCTGCCGCG CCAGGACGAC 350

GGCCGTCTCG ACACCGGGGT GATCGGCCAG CGCCGCCTCG ATCTCACCGA 400

GCTCGATGCG GAAGCCGCGG ATCTTGACCT GATGATCCGC GCGCCCGATG 450 

AAGTCGAGGT TGCCGTCCGG AAGCCAGCGC ACCAGGTCGC CGGTCCGGTA 500 

CAGCCGCGAG CCAGGTGCAC CGAATGGATC GGGTACGAAC CGCGCTCCGG 550 

TGAGGGCGGC ATCATCGACA TAGCCGCGCG CGAGGTTCTC GCCACCGATG 600 

TACAGCTCGC CGATCACGCG CGCCGGAACG GGCTCGAGTG CGCTATCGAG 650 

CACGTAGACC TGAACGTTGT CGAGCGGACG GCCGATCGAC GGCAGCTCGG 700 

ACCCGTGTTC GGACGCGGGC GACACGATCG CCCACGTCGT ATCGACCGCG 750 

TTCTCCGTCG GGCCGTACTC GTTGAGCATG CGGTAGTGCG CATCGCGCGG 800

TGGACGCCGC GTGAGTCGAT CACCGCCCGT ACGCAGCACG CGCAACGAGC 850 

GTGGAAAGTC GCCAGCCGCG AGCAACGCGT CGAGTAGCCG GCCTGGAAGA 900 

TCGGAGATCG TGATCCCCCA TCGCGTCAGG TTCTCGAGCA GGCGCGGCGG 950 

ATCGAGGCGG AGCTCGTTGT CCACCAGATG AAGCCGGGCG CCCGTCGCCA 1000 

GCGTGGACCA CAGCTCGAGC GCCGCGGCAT CGAACGACAT CGAGTAGATC 1050 

TGCGTCACGC GGTCGTCGGC ACTGATCTCG ACGGCACGCT GGTTCCACGC 1100 

GATCAAATTT CTCAGTGCAC GGTGCGGCAC GGCGACGCCC TTCGGCTTGC 1150 

CCGTCGTGCC CGACGTGTAG AT 1172 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 
(iii) HYPOTHETICAL: no

(v) FRAGMENT TYPE: internal fragment 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Ala Val

5 10 15

Pro His Arg Ala Leu Arg Asn Leu Ile Ala Trp Asn Gln Arg Ala

20 25 30

Val Glu Ile Ser Ala Asp Asp Arg Val Thr Gln Ile Tyr Ser Met

35 40 45

Ser Phe Asp Ala Ala Ala Leu Glu Leu Trp Ser Thr Leu Ala Thr

50 55 60

Gly Ala Arg Leu His Leu Val Asp Asn Glu Leu Arg Leu Asp Pro

65 70 75

Pro Arg Leu Leu Glu Asn Leu Thr Arg Trp Gly Ile Thr Ile Ser

80 85 90

Asp Leu Pro Gly Arg Leu Leu Asp Ala Leu Leu Ala Ala Gly Asp

95 100 105

Phe Pro Arg Ser Leu Arg Val Leu Arg Thr Gly Gly Asp Arg Leu

110 115 120

Thr Arg Arg Pro Pro Arg Asp Ala His Tyr Arg Met Leu Asn Glu

125 130 135

Tyr Gly Pro Thr Glu Asn Ala Val Asp Thr Thr Trp Ala Ile Val

140 145 150

Ser Pro Ala Ser Glu His Gly Ser Glu Leu Pro Ser Ile Gly Arg

155 160 165

Pro Leu Asp Asn Val Gln Val Tyr Val Leu Asp Ser Ala Leu Glu

170 175 180

Pro Val Pro Ala Arg Val Ile Gly Glu Leu Tyr Ile Gly Gly Glu

185 190 195

Asn Leu Ala Arg Gly Tyr Val Asp Asp Ala Ala Leu Thr Gly Ala

200 205 210

Arg Phe Val Pro Asp Pro Phe Gly Ala Pro Gly Ser Arg Leu Tyr

215 220 225

Arg Thr Gly Asp Leu Val Arg Trp Leu Pro Asp Gly Asn Leu Asp

230 235 240

Phe Ile Gly Arg Ala Asp His Gln Val Lys Ile Arg Gly Phe Arg

245 250 255

Ile Glu Leu Gly Glu Ile Glu Ala Ala Leu Ala Asp His Pro Gly

260 265 270

Val Glu Thr Ala Val Val Leu Ala Arg Gln Glu Arg Ala Gly Asp

275 280 285

Lys Arg Leu Val Ala Tyr Trp Ser Pro Arg Leu Ala Ser Val Ala

290 295 300

Ala Ser Glu Leu Arg Asp His Leu Arg Thr Arg Leu Pro Glu Tyr

305 310 315

Met Ile Pro Ala Ala Phe Val Glu Leu Arg Glu Met Pro Arg Met

320 325 330

Ala Ser Gly Lys Ile Asp Leu Lys Val Leu Pro Ala Pro Pro Asp

335 340 345

Asp Ser Ser Asp Arg Thr Asp Tyr Asp Pro Pro Ser Thr Pro Val

350 355 360

Glu Val Lys Leu Ala Arg Ile Trp Ser Gly Leu Leu Gly Val Glu

365 370 375

Arg Val Ser Arg Thr Asp Ser Phe Phe Ala Pro Gly Gly Pro Ser

380 385 390

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 472 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: no 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 19: 

TTCGGCGGGT TCCAGACGGC CATGGTGCTG ACGACGGGAC GGGACAATGA 50 

GAAGTAGCGT CGCGGTCACC GGCATCGGCC TGGTGGCCGC CAACGGGCTC 100 

ACCACCGAGG ACGTGTGGTC GGCCGTGCTC GGCGGCCGCA GCGGCCTTGG 150 

AACGATCACC CGTTTCGACG CCGCGGGCTA CCCGGCCCGG ATCGCCGGCG 200

AGGTGTCGCA GTTCGTGGCC GAGGAGCACA TCGCCGACCG GCTGATCCCG 250 

CAGACCGACC ACATGACCCG GCTGGCGCTG GCCGCGGCCG AGTCGGCGAT 300

CCGGGACGCC AAGGTGGGAC CTGGCCGAGC TGCCCGATTC GGCGCGGGCG 350 

TGGTCACCGC CGCGACGGCA GGCGGCTTCG AGTTCGGCCA GCGGGAGCTG 400

GAGAACCTGT GGCGCAAGGG GCCTGAGCAC GTCAGCCCCT ACCAGTCCTT 450

CGCCTGGTTC TACGCCGTCA AC 472



(2) INFORMATION FOR SEQ ID NO: 20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein

(iii) HYPOTHETICAL: no

(v) FRAGMENT TYPE: internal fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Arg Ser Ser Val Ala Val Thr Gly Ile Gly Leu Val Ala Ala

5 10 15

Asn Gly Leu Thr Thr Glu Asp Val Trp Ser Ala Val Leu Gly Gly

20 25 30

Arg Ser Gly Leu Gly Thr Ile Thr Arg Phe Asp Ala Ala Gly Tyr

35 40 45

Pro Ala Arg Ile Ala Gly Glu Val Ser Gln Phe Val Ala Glu Glu

50 55 60

His Ile Ala Asp Arg Leu Ile Pro Gln Thr Asp His Met Thr Arg

65 70 75

Leu Ala Leu Ala Ala Ala Glu Ser Ala Ile Arg Asp Ala Lys Val

80 85 90

Gly Pro Gly Arg Ala Ala Arg Phe Gly Ala Gly Val Val Thr Ala

95 100 105

Ala Thr Ala Gly Gly Phe Glu Phe Gly Gln Arg Glu Leu Glu Asn

110 115 120

Leu Trp Arg Lys Gly Pro Glu His Val Ser Pro Tyr Gln Ser Phe

125 130 135

Ala Trp Phe Tyr Ala Val Asn

140

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 637 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
TATATTACTC CAGGTTGCTT ACGAAGCATT GGAGATGTCC GGATATTTCG 50 
CCGATTCGTC CAGGCCTGAG GATGTCGGTT GCTATATTGG AGCTTGTGCA 100 
ACAGATTACG ATTTCAACGT AGCATCCCAT CCTCCCACGG CGTATTCAGC 150 
GACTGGCACG CTCCGATCTT TTCTAAGTGG CAAGCTGTCG CATTACTTTG 200
GTTGGTCCGG TCCCTCTCTT GTCCTAGACA CTGCCTGCTC TTCGTCGGCG 250
GTGGCTATTC ATACTGCATG TACTGCTTTG AGGACTGGCC AGTGTTCTCA 300
AGCTCTAGCA GGCGGGATCA CGTTGATGAC AAGCCCGTAT CTCTATGAGA 350
ACTTCTCTGC AGCCCATTTC TTGAGTCCAA CGGGAGGTTC AAAGCCGTTC 400
AGCGCAGRTG CAGATGGATA CTGTAGAGGA GAAGGTGGTG GCCTCGTGGT 450

CTTGAAACGA CTTTCAGATG CTCTCAGGGA TGATGACCAT ATTATTAGTG 500

TCATCGCTGG CTCGGCGGTC AACCAGAACG ACAACTGCGT GCCTATCACC 550

GTCCCTCACA CTTCGTCTCA GGGAAATCTC TATGAACGAG TTACCAGACA 600

GGCAGGGGTG ACACCCAATA AAGTCACTTT TGTGGAA 637

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Ile Leu Leu Gln Val Ala Tyr Glu Ala Leu Glu Met Ser Gly Tyr

5 10 15

Phe Ala Asp Ser Ser Arg Pro Glu Asp Val Gly Cys Tyr Ile Gly

20 25 30

Ala Cys Ala Thr Asp Tyr Asp Phe Asn Val Ala Ser His Pro Pro

35 40 45

Thr Ala Tyr Ser Ala Thr Gly Thr Leu Arg Ser Phe Leu Ser Gly

50 55 60

Lys Leu Ser His Tyr Phe Gly Trp Ser Gly Pro Ser Leu Val Leu

65 70 75

Asp Thr Ala Cys Ser Ser Ser Ala Val Ala Ile His Thr Ala Cys

80 85 90

Thr Ala Leu Arg Thr Gly Gln Cys Ser Gln Ala Leu Ala Gly Gly

95 100 105

Ile Thr Leu Met Thr Ser Pro Tyr Leu Tyr Glu Asn Phe Ser Ala

110 115 120

Ala His Phe Leu Ser Pro Thr Gly Gly Ser Lys Pro Phe Ser Ala

125 130 135

Xaa Ala Asp Gly Tyr Cys Arg Gly Glu Gly Gly Gly Leu Val Val

140 145 150

Leu Lys Arg Leu Ser Asp Ala Leu Arg Asp Asp Asp His Ile Ile

155 160 165

Ser Val Ile Ala Gly Ser Ala Val Asn Gln Asn Asp Asn Cys Val

170 175 180

Pro Ile Thr Val Pro His Thr Ser Ser Gln Gly Asn Leu Tyr Glu

185 190 195

Arg Val Thr Arg Gln Ala Gly Val Thr Pro Asn Lys Val Thr Phe

200 205 210

Val Glu



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1177 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI- SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
GCACGACGGG CAAGCCCAAG GGGGGCGATG AACAGCCATC GAGGAATTTG 50 
CAATCGCTTA CTGTGGATGC AAGATGCTTA CAAACTAACT GAAACTGATC 100 
GCGTTCTGCA AAAAACGCCT TTTAGTTTCG ACGTTTCCGT TTGGGAGTTT 150 
TTCTGGCCTC TCTTGACAGG GGCGCGTTTA GTGATGGCTC AACCAGGCGG 200
ACAGCGAGAT GCAACTTACT TAATTAACAC CATCGTCCAA GAGGAAATTA 250
CAACACTGCA TTTTGTCCCC TCCATGTTGC GGATATTTCT CCAAACTAAA 300
GGGCTAGAAC GTTGTCAATC TCTAAAACGG GTGTTTTGTA GTGGAGAAGC 350
CTTACCAGTT GACCTCCAGG AGCGGTTTTT TGACTCGATG GGATGTGAAC 400 
TACACAACCT CTATGGTCCT ACCGAAGCGG CAATTGATGT CACATTTTGG 450 
CAGTGTCAAA GAGAGAGTAA CTTAAAAAGT GTACCGATTG GGAGAGCGAT 500 
CGCCAACACT CAAMTTTATA TCCTCGACTC CCATTTACAA GCAGTTCCCT 550 
TGGGTGCGAT CGGCGAACTT TATATTGGTG GTATCGGCGT TGCTAGAGGS 600 
TATCTTAACC GTCCAGACTT AACAGCCGAG CGATTTATTT CCCATCCCTT 650 
TAAGGAAGGC GRRAAACTTT ACAAAACAGG AGACTTAGCC CGATATCTGG 700 
CCGATGGCAA TATCGAATAC ATCGGTAGAA TTGATCATCA AGTAAAAATT 750 
CGGGGTTTCC GCATCGAACT TGGAGAAATC GAAACTTTAC TAGCACAACA 800
CCCGACCATA CAGCAAACTG TCGTCACAGC TAGAATTGAT CATCTCGAAA 850
ACCAGCGATT AGTCGCCTAC ATCGTTCCTC ATTCAGAGCA GACACTAACC 900 
ACAGACGAAC TGCGCCACTT CCTCAAAAAG AAACTGCCAG AATATATGGT 950 
GCCTAGTACT TTCGTTTTCC TAGACACTCT ACCCCTAACC CCCAACGGCA 1000 
AAATTGACCG TCGCGCTTTA CCAGCACCCG ACTCAACAAG GCTTGATTCA 1050 
GAAAACACAT ATCTTGCTCC CCGCGATTAA TTAGAATTTC AGTTGACTAA 1100 
AATTTGGTCA GAAATTTTAG GTATCCAGCC TATCGGTGTC AGGGACAACT 1150 
TCTTCTTCCT TGGGCGGCCC CTCCCTT 1177 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Ala Arg Arg Ala Ser Pro Arg Gly Ala Met Asn Ser His Arg Gly 

5 10 15

Ile Cys Asn Arg Leu Leu Trp Met Gln Asp Ala Tyr Lys Leu Thr

20 25 30

Glu Thr Asp Arg Val Leu Gln Lys Thr Pro Phe Ser Phe Asp Val

35 40 45

Ser Val Trp Glu Phe Phe Trp Pro Leu Leu Thr Gly Ala Arg Leu

50 55 60

Val Met Ala Gln Pro Gly Gly Gln Arg Asp Ala Thr Tyr Leu Ile

65 70 75

Asn Thr Ile Val Gln Glu Glu Ile Thr Thr Leu His Phe Val Pro

80 85 90

Ser Met Leu Arg Ile Phe Leu Gln Thr Lys Gly Leu Glu Arg Cys

95 100 105

Gln Ser Leu Lys Arg Val Phe Cys Ser Gly Glu Ala Leu Pro Val

110 115 120

Asp Leu Gln Glu Arg Phe Phe Asp Ser Met Gly Cys Glu Leu His

125 130 135

Asn Leu Tyr Gly Pro Thr Glu Ala Ala Ile Asp Val Thr Phe Trp

140 145 150

Gln Cys Gln Arg Glu Ser Asn Leu Lys Ser Val Pro Ile Gly Arg

155 160 165

Ala Ile Ala Asn Thr Gln Xaa Tyr Ile Leu Asp Ser His Leu Gln

170 175 180

Ala Val Pro Leu Gly Ala Ile Gly Glu Leu Tyr Ile Gly Gly Ile

185 190 195

Gly Val Ala Arg Gly Tyr Leu Asn Arg Pro Asp Leu Thr Ala Glu

200 205 210

Arg Phe Ile Ser His Pro Phe Lys Glu Gly Gly Lys Leu Tyr Lys

215 220 225

Thr Gly Asp Leu Ala Arg Tyr Leu Ala Asp Gly Asn Ile Glu Tyr

230 235 240

Ile Gly Arg Ile Asp His Gln Val Lys Ile Arg Gly Phe Arg Ile

245 250 255

Glu Leu Gly Glu Ile Glu Thr Leu Leu Ala Gln His Pro Thr Ile

260 265 270

Gln Gln Thr Val Val Thr Ala Arg Ile Asp His Leu Glu Asn Gln

275 280 285

Arg Leu Val Ala Tyr Ile Val Pro His Ser Glu Gln Thr Leu Thr

290 295 300

Thr Asp Glu Leu Arg His Phe Leu Lys Lys Lys Leu Pro Glu Tyr 

305 310 315 

Met Val Pro Ser Thr Phe Val Phe Leu Asp Thr Leu Pro Leu Thr 

320 325 330 

Pro Asn Gly Lys Ile Asp Arg Arg Ala Leu Pro Ala Pro Asp Ser

335 340 345 

Thr Arg Leu Asp Ser Glu Asn Thr Tyr Leu Ala Pro Arg Asp Xaa 

350 355 360 

Leu Glu Phe Gln Leu Thr Lys Ile Trp Ser Glu Ile Leu Gly Ile

365 370 375

Gln Pro Ile Gly Val Arg Asp Asn Phe Phe Phe Leu Gly Arg Pro

380 385 390 

Leu Pro 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 406 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Met Ser Ile Arg Thr Val Val Thr Gly Leu Gly Ile Ala Ala Pro

5 10 15

Asn Gly Leu Gly Ile Glu Glu Tyr Trp Ser Ala Thr Leu Ala Gly

20 25 30

Arg Gly Ala Ile Gly Pro Leu Thr Arg Phe Asp Ala Ser Ser Tyr

35 40 45

Pro Ser Arg Leu Ala Gly Glu Ile Arg Gly Phe Thr Ala Ala Glu

50 55 60

His Leu Pro Gly Arg Leu Leu Pro Gln Thr Asp Arg Met Thr Gln

65 70 75 

Leu Ala Leu Val Ser Ala Gly Trp Ala Leu Asp Asp Ala Gly Val 

80 85 90 



Val Pro Asp Glu Leu Pro Ala Tyr Asp Met Gly Val Ile Thr Ala

95 100 105

Ser His Ala Gly Gly Phe Glu Phe Gly Gln Asn Glu Leu Lys Ala

110 115 120

Leu Trp Ser Lys Gly Gly Lys Tyr Val Ser Ala Tyr Gln Ser Phe

125 130 135

Ala Trp Phe Tyr Ala Val Asn Ser Gly Gln Ile Ser Ile Arg Asn

140 145 150

Gly Met Arg Gly Pro Ser Gly Val Val Val Ser Asp Gln Ala Gly

155 160 165

Gly Leu Asp Ala Leu Ala Gln Ala Arg Arg Gln Ile Arg Lys Gly

170 175 180

Thr Pro Leu Ile Val Ser Gly Ala Val Asp Ala Ser Leu Cys Thr

185 190 195

Trp Gly Trp Val Ala Gln Leu Ala Gly Gly Arg Leu Ser Arg Ser

200 205 210

Asp Asp Pro Gly His Ala Tyr Val Pro Phe Asp Asp Ala Ala Val

215 220 225

Gly His Val Pro Gly Glu Gly Gly Ala Leu Leu Ile Leu Glu Glu

230 235 240

Ala Glu His Ala Arg Ser Arg Gly Ala Arg Arg Ile Tyr Gly Glu

245 250 255

Ile Thr Gly His Ala Ser Thr Phe Asp Pro Pro Pro Trp Ser Gly

260 265 270

Arg Gly Pro Ala Val Gln Arg Val Ile Glu Glu Ala Leu Ala Asp

275 280 285

Ala Gly Thr Val Pro Asp Glu Val Asp Val Val Phe Ala Asp Ala

290 295 300

Ala Ala Leu Pro Glu Leu Asp Arg Ile Glu Ala Ala Ala Ile Thr

305 310 315

Lys Val Phe Gly Pro His Ala Val Pro Val Thr Ala Pro Lys Thr

320 325 330

Met Thr Gly Arg Leu Tyr Ser Gly Ala Ala Pro Leu Asp Val Ala

335 340 345

Ala Ala Cys Leu Ala Ile Arg Asp Gly Leu Ile Pro Pro Thr Ile

350 355 360

His Ser Ser Leu Ser Gly Arg Tyr Glu Ile Asp Leu Val Thr Gly

365 370 375

Ala Pro Arg Thr Ala Pro Val Arg Thr Ala Leu Val Val Ala Arg

380 385 390

Gly His Gly Gly Phe Asn Ser Ala Val Val Val Arg Ala Pro Arg 

395 400 405 

Asp 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Thr Ser Glu Leu Leu Glu Arg Thr Ala Val Arg Ser Ala Thr 

5 10 15 

Ala Val Phe Thr Gly Ile Gly Val Thr Ala Pro Asn Gly Leu Gly

20 25 30

Thr Ala Ala Trp Trp Gln Ala Thr Val Ala Gly Glu Ser Gly Ile

35 40 45 

Arg Pro Val Ser Arg Phe Asp Ala Ser Gly Tyr Pro Ser Thr Leu 

50 55 60 

Ala Gly Glu Val Pro Gly Phe Asp Ala Glu Glu His Ile Pro Ser

65 70 75

Arg Leu Leu Ser Gln Thr Asp His Met Thr Arg Leu Ala Leu Thr

80 85 90 

Ala Ala Lys Glu Ala Leu Glu Asp Ser Gly Ala Asp Pro Ala Glu 

95 100 105 

Met Pro Gln Tyr Ser Ala Gly Ala Val Thr Ala Ala Ser Ala Gly

110 115 120

Gly Phe Glu Phe Gly Gln Arg Glu Leu Gln Ala Leu Trp Ser Lys

125 130 135

Gly Gly Gln Tyr Val Ser Ala Tyr Gln Ser Tyr Ala Trp Phe Tyr

140 145 150

Ala Val Asn Thr Gly Gln Ile Ser Ile Arg His Gly Leu Arg Gly

155 160 165

Pro Ser Gly Val Leu Val Thr Glu Gln Ala Gly Gly Leu Glu Ala

170 175 180

Val Ala Gln Ala Arg Arg Gln Leu Arg Lys Gly Ser Lys Leu Ile

185 190 195 

Val Thr Gly Gly Val Asp Gly Ala Val Cys Pro Trp Gly Trp Thr 

200 205 210 

Ala Gln Leu Ala Gly Gly Arg Met Ser Pro Val Ala Asp Pro Ala

215 220 225 

Arg Ala Phe Leu Pro Phe Asp Ser Glu Ala Ser Gly Tyr Val Ala 

230 235 240 

Gly Glu Gly Gly Ala Ile Leu Val Leu Glu Asp Ala Glu Ala Ala

245 250 255

Arg Glu Arg Gly Ala Arg Ile Tyr Gly Arg Leu Ser Gly Tyr Ala

260 265 270 

Ala Thr Phe Asp Pro Ala Pro Gly Arg Gly Gly Glu Pro Gly Leu 

275 280 285 

Arg Arg Ala Ala Glu Leu Ala Leu Thr Glu Ala Gly Leu Ser Ala 

290 295 300 

Ser Asp Val Asp Val Val Phe Ala Asp Ala Ser Gly Val Pro Glu 

305 310 315 

Leu Asp Arg Gln Glu Glu Ala Ala Leu Thr Ala Leu Phe Gly Pro

320 325 330 

Arg Gly Val Pro Val Thr Ala Pro Lys Thr Met Thr Gly Arg Leu 

335 340 345 

Ser Ala Gly Gly Ala Ser Leu Asp Leu Ala Ala Ala Leu Leu Ser 

350 355 360 

Ile Arg Asp Ala Val Ile Pro Pro Thr Val Asn Val Thr Ser Pro

365 370 375 

Val Ala Ala Asp Ala Leu Asp Leu Val Thr Glu Ala Arg Arg Gly 

380 385 390 

Pro Val Arg Thr Ala Leu Val Leu Ala Arg Gly Thr Gly Gly Phe 

395 400 405 

Asn Ala Ala Ala Val Val Thr Ala Ala Asn 

410 415 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 403 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein

(iii) HYPOTHETICAL: no
(v) FRAGMENT TYPE: internal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
Met Ile Pro Val Ala Val Thr Gly Met Gly Val Ala Ala Pro Asn

5 10 15 

Gly Leu Gly Ala Ala Asp Tyr Trp Ala Ala Thr Arg Gly Gly Lys 

20 25 30 

Ser Gly Ile Gly Arg Ile Thr Arg Phe Asp Pro Ser Ser Tyr Pro

35 40 45

Ala Arg Leu Ala Gly Glu Ile Pro Gly Phe Glu Ala Ala Glu His

50 55 60

Leu Pro Gly Arg Leu Leu Pro Gln Thr Asp Arg Val Thr Arg Leu

65 70 75 

Ser Leu Ala Ala Ala Asp Trp Ala Leu Ala Asp Ala Gly Val Glu 

80 85 90 

Pro Glu Ser Phe Asp Pro Leu Asp Met Gly Val Val Thr Ala Gly 

95 100 105 

His Ala Gly Gly Phe Glu Phe Gly Gln Gly Glu Leu Gln Lys Leu

110 115 120

Trp Ala Lys Gly Ser Gln Phe Val Ser Ala Tyr Gln Ser Phe Ala

125 130 135

Trp Phe Tyr Ala Val Asn Ser Gly Gln Ile Ser Ile Arg His Gly

140 145 150

Met Lys Gly Pro Asn Gly Val Val Val Ser Asp Gln Ala Gly Gly

155 160 165

Leu Asp Ala Leu Ala Gln Ala Arg Arg Leu Val Arg Lys Gly Thr

170 175 180

Pro Leu Ile Val Cys Gly Ala Val Asp Ala Ser Ile Cys Pro Trp

185 190 195

Gly Trp Val Ala Gln Leu Ala Gly Gly Arg Met Ser Asp Ser Asp

200 205 210 

Glu Pro Ala Arg Ala Tyr Leu Pro Phe Asp Arg Asp Ala Arg Gly 

215 220 225 

Tyr Leu Pro Gly Glu Gly Gly Ala Ile Leu Ile Met Glu Pro Ala

230 235 240

Ala Ala Ala Arg Ala Arg Gly Ala Lys Val Tyr Gly Glu Ile Ser

245 250 255

Gly Tyr Gly Ala Thr Phe Asp Pro Pro Pro Gly Ser Gly Ser Gly

260 265 270

Ser Thr Leu Arg Thr Ala Ile Arg Val Ala Leu Asp Asp Ala Gly

275 280 285 

Val Ala Pro Gly Asp Val Asp Ala Val Phe Ala Asp Gly Ala Gly 

290 295 300 

Val Pro Glu Leu Asp Arg Ala Glu Ala Glu Ala Ile Thr Asp Val

305 310 315 

Phe Gly Ser Gly Gly Val Pro Val Thr Val Pro Lys Thr Met Thr 

320 325 330 

Gly Arg Leu Tyr Ser Gly Ala Ala Pro Leu Asp Val Ala Cys Ala 

335 340 345 

Leu Leu Ala Met Gln Ala Gly Val Ile Pro Pro Thr Val His Ile

350 355 360

Asp Pro Cys Pro Glu Tyr Gly Leu Asp Leu Val Leu His Gln Ala

365 370 375 

Arg Pro Ala Thr Val Arg Thr Ala Leu Val Leu Ala Arg Gly His 

380 385 390 

Gly Gly Phe Asn Ser Ala Met Ala Val Arg Ala Gly Arg 

395 400 



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 407

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
Met Ser Ala Arg Phe Leu Val Thr Gly Ile Gly Val Ala Ala Pro

5 10 15

Ser Gly Leu Gly Val Glu Asp Phe Trp Ser Val Thr Arg Ile Gly

20 25 30

Lys Asn Ala Ile Gly Pro Val Thr Arg Phe Asp Ala Ser Ala Tyr

35 40 45

Pro Ser Arg Leu Ala Gly Glu Ile His Gly Phe Glu Pro Lys Glu

50 55 60

His Leu Pro Gly Arg Leu Val Pro Gln Thr Asp Arg Val Thr Gln

65 70 75

Leu Ala Leu Val Ala Ala Asp Cys Ala Phe Ala Asp Ala Gly Ile

80 85 90

Glu Pro Gly Thr Ile Asp Pro Tyr Ala Met Gly Val Val Thr Ala

95 100 105

Ala Gly Ala Gly Gly Phe Glu Phe Ala Glu Asn Glu Leu Arg Lys

110 115 120

Leu Trp Ser Glu Gly Ala Lys Arg Val Ser Ala Tyr Gln Ser Phe

125 130 135

Ala Trp Phe Tyr Ala Val Asn Ser Gly Gln Ile Ser Ile Arg Asn

140 145 150

Gly Leu Arg Gly Pro Ala Gly Val Val Ile Ser Asp Gln Ala Gly

155 160 165

Gly Leu Asp Ala Leu Ala Gln Ala Arg Arg Gln Leu Arg Lys Gly

170 175 180

Ser Lys Leu Ile Ala Thr Gly Gly Phe Asp Ala Pro Ile Cys Ser

185 190 195

Leu Gly Trp Ala Ser Gln Pro Arg Thr Gly Gly Leu Met Phe His

200 205 210

Glu Arg Thr Glu Pro Glu Arg Ala Tyr Leu Pro Phe Glu Asp Ala

215 220 225

Ala Ala Gly Tyr Val Pro Gly Glu Gly Gly Ala Met Leu Ile Leu

230 235 240

Glu Asp Glu Asp Ser Ala Arg Asp Arg Gly Ala Arg Thr Val Tyr

245 250 255

Gly Glu Phe Ala Gly Tyr Gly Ala Thr Leu Asp Pro Lys Pro Gly

260 265 270

Ser Gly Arg Glu Pro Gly Leu Arg Arg Ala Ile Asp Val Ala Leu

275 280 285

Thr Asp Ala Ala Cys His Pro Ala Glu Val Glu Val Val Phe Ala

290 295 300

Asp Gly Ala Ala Thr Pro Arg Leu Asp Arg Glu Glu Ala Glu Ala

305 310 315

Ile Thr Ala Val Phe Gly Pro Arg Ala Val Pro Val Thr Val Pro

320 325 330

Lys Thr Met Thr Gly Arg Ile Asn Ser Gly Gly Ala Pro Ile Asp

335 340 345

Val Val Ser Ala Val Leu Ser Met Arg Glu Gly Leu Ile Pro Pro

350 355 360

Thr Thr Asn Val Glu Leu Ser Asp Ala Tyr Asp Leu Asp Leu Val 

365 370 375 

Ala Val Arg Pro Arg Thr Ala Ser Val Arg Thr Ala Leu Val Leu 

380 385 390 

Ala Arg Gly Arg Gly Gly Phe Asn Ser Ala Val Val Val Arg Ala 

395 400 405 

Val Asp 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GGATCTGCTT GAGGTAGTCT ACGAGGCACT GGAGTCAGCA GGGTACTTTG 50 
GCGCCAAGTC AAACCCGGAA CCTGATGACT ATGGATGCTA TATCGGTGCA 100 
GTGATGAACA ACTACTATGA CAACGTTTCT TGCCATCCAC CCACCGCATA 150 
CGCTACTCTT GGAACGTCGC GTTGCTTCCT TAGTGGCTGC ATGAGCCATT 200
ACTTTGGATG GACGGGACCT TCCTTGACCA TTGATACGGC TTGCTCGTCA 250
TCACTAGTTG CTATAAACAC CGCTTGTAGA GCAATATGGT CTGGTGAGTG 300
CTCCCGGGCC ATAGCTGGGG GTACCAATGT CTTCACAAGT CCGTTTGACT 350
ACCAGAATCT TCGCGCCGCA GGATTCCTCA GCCCTAGCGG GCAATGCAAG 400
CCGTTTGATG CTTCTGCTGA TGGCTACTGC CGTGGAGAAG GAGTTGGTGT 450 
CGTTGTGCTT AAGCCTTTGA CGGCTGCTAT GCAAGAGAAC GATAACATCC 500 
TTGGCGTCAT TGTGGGGTCT GCAGCAAACC AAAACCAAAA CCTCAGTCAT 550 
ATCACGGTGC CCCATTCGGG CTCACAAGTC CAGCTTTATC GAAAGGTGAT 600 
GAAGCTTGCA GGTATAGAGC CAGAGTCAGT CTCCTACGTT GAG 643 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

Ile Leu Leu Gln Val Ala Tyr Glu Ala Leu Glu Met Ser Gly Tyr

5 10 15

Phe Ala Asp Ser Ser Arg Pro Glu Asp Val Gly Cys Tyr Ile Gly

20 25 30 



Ala Cys Ala Thr Asp Tyr Asp Phe Asn Val Ala Ser His Pro Pro

35 40 45

Thr Ala Tyr Ser Ala Thr Gly Thr Leu Arg Ser Phe Leu Ser Gly 

50 55 60 

Lys Leu Ser His Tyr Phe Gly Trp Ser Gly Pro Ser Leu Val Leu 

65 70 75 

Asp Thr Ala Cys Ser Ser Ser Ala Val Ala Ile His Thr Ala Cys

80 85 90

Thr Ala Leu Arg Thr Gly Gln Cys Ser Gln Ala Leu Ala Gly Gly

95 100 105

Ile Thr Leu Met Thr Ser Pro Tyr Leu Tyr Glu Asn Phe Ser Ala

110 115 120 

Ala His Phe Leu Ser Pro Thr Gly Gly Ser Lys Pro Phe Ser Ala 

125 130 135 

Xaa Ala Asp Gly Tyr Cys Arg Gly Glu Gly Gly Gly Leu Val Val 

140 145 150 

Leu Lys Arg Leu Ser Asp Ala Leu Arg Asp Asp Asp His Ile Ile

155 160 165

Ser Val Ile Ala Gly Ser Ala Val Asn Gln Asn Asp Asn Cys Val

170 175 180

Pro Ile Thr Val Pro His Thr Ser Ser Gln Gly Asn Leu Tyr Glu

185 190 195

Arg Val Thr Arg Gln Ala Gly Val Thr Pro Asn Lys Val Thr Phe

200 205 210 

Val Glu 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
AATCCTCATG GAATCAGCTT GGCAAACACT AGAAAACGCT GGCATAACTG 50 
CGAACAAAGT AGCTGGCAGC AGTACAGGAG TTTTTGTGGG TGCTAGTGGC 100
TCTGATTACT GTTGGGTAAT GGAGCGGGTA GGTATTCCCA TAGAAGCTCA 150
CGTTGCAACG GGCACGTCGT TGGCAGCGCT GGCAAATCGC ATCTCTTACT 200
TTTTTGACTT GCGAGGCCCA AGCATCGTCA TTGATACGGC GTGTTCTAGT 250
TCGTTGATGG CAGTGCATCA GGCGGTTCAA TCTATCCGAG CAGGTGAGTG 300
CTTACAAGCA CTGGTGGGCG GTATACATAT CATGAGCCAT CCGGCTAACA 350
GTATTGCATA TTACAAGGCT GGGATGTTGG CGCATGATGG CAAGTGCAAG 400
ACATTTGACG ATCGCGCAGA TGGGTACGTT CGCAGTGAAG GCGCTGTGAT 450
GCTTCTGCTC AAGCAATTGC ATCAGGCGGA AGCAGATGGC GATCTAATTT 500
ATGCGACAAT CAAGGGGTCA GCCTCGAATC ATGGTGGACA GTCCGCCGGC 550
CTCACCGTAC CGAATCCGCA ACAGCAGGCA GCACTCTTAA CCAATGCCTG 600
GAAAGCCTCT GGTGTAGACC CTAACACGAT TAGTTTTATC GAA 643



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Ile Leu Met Glu Ser Ala Trp Gln Thr Leu Glu Asn Ala Gly Ile

5 10 15 

Thr Ala Asn Lys Val Ala Gly Ser Ser Thr Gly Val Phe Val Gly 

20 25 30 

Ala Ser Gly Ser Asp Tyr Cys Trp Val Met Glu Arg Val Gly Ile

35 40 45

Pro Ile Glu Ala His Val Ala Thr Gly Thr Ser Leu Ala Ala Leu

50 55 60

Ala Asn Arg Ile Ser Tyr Phe Phe Asp Leu Arg Gly Pro Ser Ile

65 70 75

Val Ile Asp Thr Ala Cys Ser Ser Ser Leu Met Ala Val His Gln

80 85 90

Ala Val Gln Ser Ile Arg Ala Gly Glu Cys Leu Gln Ala Leu Val

95 100 105

Gly Gly Ile His Ile Met Ser His Pro Ala Asn Ser Ile Ala Tyr

110 115 120 

Tyr Lys Ala Gly Met Leu Ala His Asp Gly Lys Cys Lys Thr Phe 

125 130 135 

Asp Asp Arg Ala Asp Gly Tyr Val Arg Ser Glu Gly Ala Val Met 

140 145 150 

Leu Leu Leu Lys Gln Leu His Gln Ala Glu Ala Asp Gly Asp Leu

155 160 165

Ile Tyr Ala Thr Ile Lys Gly Ser Ala Ser Asn His Gly Gly Gln

170 175 180

Ser Ala Gly Leu Thr Val Pro Asn Pro Gln Gln Gln Ala Ala Leu

185 190 195

Leu Thr Asn Ala Trp Lys Ala Ser Gly Val Asp Pro Asn Thr Ile

200 205 210

Ser Phe Ile Glu



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 637 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
TATATTACTC CAGGTTGCTT ACGAAGCATT GGAAATGTCC GGGTATTTCG 50
CCGACTCGTC CAAGCCTGAG GACGTAGGTT GCTATATTGG AGCTTGTGCA 100
ACAGATTACG ATTTCAGCGT AGCGTCCCAT CCTCCTACGG CATACTCAGC 150
AACTGGCACG CTCCGATCTT TCCTGAGTGG CAAGCTGTCA CATTACTTTG 200
GTTGGTCTGG TCCCTCTCTT GTCCTGGACA CCGCCTGCTC TTCATCGGCG 250
GTGGCCATTC ACACTGCATG TACTGCTTTG AGGACTGGCC AGTGTTCTCA 300
GGCTTTAGCA GGCGGGATTA CTTTGATGAC CAGCCCGTAT CTCTTTGAGA 350
ACTTTGCTGC CGCCCATTTC TTGAGCCCAA CGGGAGGCTC AAAGCCGTTC 400 
AGTGCAGATG CAGATGGGTA TTGTAGAGGA GAAGGGGGTG GGCTCGTGGT 450 
CTTGAAACGA CTTTCAGATG CTATCAGGGA TAACGACCAC ATCATTAGCG 500 
TCATCGCTGG CTCAGCCGTC AACCAGAACG CTAACTGTGT GCCTATCACC 550 
GTCCCTCATA CTTCGTCTCA GGGCAATCTC TATGAACGAG TTACCGCACA 600 
GGCAGGGGTG ACACCTAATA AGGTCACTTT TGTGGAA 637

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Ile Leu Leu Gln Val Ala Tyr Glu Ala Leu Glu Met Ser Gly Tyr

5 10 15

Phe Ala Asp Ser Ser Lys Pro Glu Asp Val Gly Cys Tyr Ile Gly

20 25 30 

Ala Cys Ala Thr Asp Tyr Asp Phe Ser Val Ala Ser His Pro Pro 

35 40 45 



Thr Ala Tyr Ser Ala Thr Gly Thr Leu Arg Ser Phe Leu Ser Gly 

50 55 60

Lys Leu Ser His Tyr Phe Gly Trp Ser Gly Pro Ser Leu Val Leu

65 70 75

Asp Thr Ala Cys Ser Ser Ser Ala Val Ala Ile His Thr Ala Cys

80 85 90

Thr Ala Leu Arg Thr Gly Gln Cys Ser Gln Ala Leu Ala Gly Gly

95 100 105

Ile Thr Leu Met Thr Ser Pro Tyr Leu Phe Glu Asn Phe Ala Ala

110 115 120

Ala His Phe Leu Ser Pro Thr Gly Gly Ser Lys Pro Phe Ser Ala

125 130 135

Asp Ala Asp Gly Tyr Cys Arg Gly Glu Gly Gly Gly Leu Val Val

140 145 150

Leu Lys Arg Leu Ser Asp Ala Ile Arg Asp Asn Asp His Ile Ile

155 160 165

Ser Val Ile Ala Gly Ser Ala Val Asn Gln Asn Ala Asn Cys Val

170 175 180

Pro Ile Thr Val Pro His Thr Ser Ser Gln Gly Asn Leu Tyr Glu

185 190 195

Arg Val Thr Ala Gln Ala Gly Val Thr Pro Asn Lys Val Thr Phe

200 205 210

Val Glu

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 691

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
CCATCTGCTA GAAATCAGCT ACGAGGCGCT CGAGAATGCA GGCTTTCCAC 50
TGCCTAGCAT TGCTGGCACG AACATGGGTG TCTTTGTCGG CGGAAGCAAC 100
TCTGAGTATC GAGCGCACAT CGGAAACGAT ACCGACAACT TACCGATGTT 150
TGAAGCAACA GGCAATGCAG AATCTCTGCT GGCGAATCGA GTCTCTTATG 200
TGTATGATCT CCACGGCGCA AGTCTGACGA TTGGTACCGC TTGTTCCGTC 250
GAGTTTAGCA GCTTTGGATA GCGCGTTTCT CAGCTTGCAG CTGGTAAGTC 300
GTCCACAGCA ATTGTTGCCG GCTCCGTTGT TCGAATCGTA CCGTCATCGA 350
CCATCTCACC TTCTACTATG AAGTAAGCAG TCATGGCTCT TGACACGGAG 400
ACTACTCACC ATTCCAGGCT TCTGTCACCA GAAGGGCGGT GTTATGCGTT 450
CGATGACAGA GCCACTAGTG GTTTTGGAAG GGGTGAAGGT TCTGCCTGCA 500
TAATATTGGA AACCTTAGAG GCAGCCTTAA GAGACAACGA CCCAATCCGA 550
TCGGTCATTC GCAATTCGGG AGTCAATCAA GATGGTAAAA CTGCAGGTAT 600
CACAATGCCA AATGGGGAAG CGCAAGCTTC ATTGATACAA TCTGTTTATC 650
GCACTGCTGG ATTGGACCCT CTGCAGACAG ATTACGTCGA G 691



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 215 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

His Leu Leu Glu Ile Ser Tyr Glu Ala Leu Glu Asn Ala Gly Phe

5 10 15

Pro Leu Pro Ser Ile Ala Gly Thr Asn Met Gly Val Phe Val Gly

20 25 30

Gly Ser Asn Ser Glu Tyr Arg Ala His Ile Gly Asn Asp Thr Asp

35 40 45 

Asn Leu Pro Met Phe Glu Ala Thr Gly Asn Ala Glu Ser Leu Leu 

50 55 60 

Ala Asn Arg Val Ser Tyr Val Tyr Asp Leu His Gly Ala Ser Leu 

65 70 75 

Thr Ile Gly Thr Ala Cys Ser Val Glu Phe Ser Ser Phe Gly Xaa

80 85 90

Arg Val Ser Gln Leu Ala Ala Gly Lys Ser Ser Thr Ala Ile Val

95 100 105

Ala Gly Ser Val Val Arg Ile Val Pro Ser Ser Thr Ile Ser Pro

110 115 120 

Ser Thr Met Lys Leu Leu Ser Pro Glu Gly Arg Cys Tyr Ala Phe 

125 130 135 

Asp Asp Arg Ala Thr Ser Gly Phe Gly Arg Gly Glu Gly Ser Ala 

140 145 150 

Cys Ile Ile Leu Glu Thr Leu Glu Ala Ala Leu Arg Asp Asn Asp

155 160 165

Pro Ile Arg Ser Val Ile Arg Asn Ser Gly Val Asn Gln Asp Gly

170 175 180

Lys Thr Ala Gly Ile Thr Met Pro Asn Gly Glu Ala Gln Ala Ser

185 190 195

Leu Ile Gln Ser Val Tyr Arg Thr Ala Gly Leu Asp Pro Leu Gln

200 205 210

Thr Asp Tyr Val Glu

215 



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 680

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 
AACTGTTAGA GGTCAGTTAC GAGGCGTTTG AGAATGCGGG CATATCATTA 50 
TCGAGTGTTG CAGGTACCGA CGTTGGGGTA TTCATCAGTG CCAGCACCAA 100 
TGATTACCGT TTCGTTTTCC ACAACGACCT CGACACATTG CCAATGTTTG 150 
AATCCACTGG GAGTGAATTA TCGATCATGT CCAATCGTAT CTCCTATACT 200
TTCAATCTTA GAGGTCCAAG TATGACGATT GATACTCCCT GTTCCTCAAG 250
TTTGATCGCA CTCCATACAG CATTCAGAAG TCTACAGGTC GGAGAAAGCT 300
CTTGCGCCAT TGTCGGTGGA TCTAACCTCC ACATCACTCC AGATTCCTAC 350
ATTTCATTCT CGACGATGAG GTAAGCACTA TCGTTTGCGA ATTACCTATC 400
TTTGATTACG AGTGACTAAG TTGTACAGGC TCCTGTCGCC CCATGGACGA 450 
TCGTGCAGTC AATGGGTTTG GGCGCGGAGA GGGCACAAGT TGCATAATAC 500 
TGAAGCCTTT AGATGCCGCA TTGAAAGACC ACGATCCCAT AAGGGCAGTT 550 
ATTCGCAATA CGGGCACTAA TCAAGATGGG AAGACGACAG GTATCACGAT 600 
GCCGAATGGT GAAGCACAGG CCGCCTTAAT GCAATCAGTC TACGAGGCAG 650 
CGGGCTTAGA TCCCCTTGAA ACAGACTATG 680 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:

Leu Leu Glu Val Ser Tyr Glu Ala Phe Glu Asn Ala Gly Ile Ser

5 10 15

Leu Ser Ser Val Ala Gly Thr Asp Val Gly Val Phe Ile Ser Ala

20 25 30 

Ser Thr Asn Asp Tyr Arg Phe Val Phe His Asn Asp Leu Asp Thr 

35 40 45 

Leu Pro Met Phe Glu Ser Thr Gly Ser Glu Leu Ser Ile Met Ser

50 55 60

Asn Arg Ile Ser Tyr Thr Phe Asn Leu Arg Gly Pro Ser Met Thr

65 70 75

Ile Asp Thr Pro Cys Ser Ser Ser Leu Ile Ala Leu His Thr Ala

80 85 90

Phe Arg Ser Leu Gln Val Gly Glu Ser Ser Cys Ala Ile Val Gly

95 100 105

Gly Ser Asn Leu His Ile Thr Pro Asp Ser Tyr Ile Ser Phe Ser

110 115 120 

Thr Met Ser Cys Thr Gly Ser Cys Arg Pro Met Asp Asp Arg Ala 

125 130 135 

Val Asn Gly Phe Gly Arg Gly Glu Gly Thr Ser Cys Ile Ile Leu

140 145 150

Lys Pro Leu Asp Ala Ala Leu Lys Asp His Asp Pro Ile Arg Ala

155 160 165

Val Ile Arg Asn Thr Gly Thr Asn Gln Asp Gly Lys Thr Thr Gly

170 175 180

Ile Thr Met Pro Asn Gly Glu Ala Gln Ala Ala Leu Met Gln Ser

185 190 195 

Val Tyr Glu Ala Ala Gly Leu Asp Pro Leu Glu Thr Asp Tyr 

200 205 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 691 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

GCATTTGCTG GAGGTGAGCT ATGAAGCGCT TGAAAATGCT GGCCTTTCTC 50 

TTCCTTGCAT TGCCGGCACC AAAATGGGAG TCTTCGTTGG TGGAGGCAAT 100 

GCAKAGTATC GATCGCATAT CGGCCAAGAT ATTGACAATC TGCCTATGTT 150 

CGAGGCAACT GGTAACGCAG AGGCGCTATT GGCGAATAGA GTTTCTTATG 200

TATATGATCT TCGAGGACCG AGTCTAACCA CCGATACCGC CTGTTCCTCA 250 

AGTCTCGCCG CTTTGAACAC GGCATTCTTA AGTCTACAGG CTGGCGAGTC 300

GTCTACAGCA CTGGTCGGTA GCTCAGTAAT TCGGCTTAGG CCTGAGTCAG 350 

CCATCTCACT TTCCAGCATG CAGTAAGTCC TTCATGGTGC ACCTGCATAC 400 

ATTGCTAATA AGTGCAGGCT TCTATCCCCA GATGGAAAAT CTTACGCGTT 450 

CGATGAGAGA GCTACCAGTG GTTTTGGAAG GGGTGAGGGT TCGGGTTGCA 500 

TAATACTAAA ACCCCTGGAC GCAGCCGTGA GAGACGGAGA CCCAATTAGA 550 

GCAGTCATTT GTAACTCGGG TGTAAACCAA GACGGCAAGA CTGCTGGTAT 600 

TACAATGCCT AATGGACACG CGCAAGCTTC TCTAATACGG TCTGTTTATC 650 

AGTCTACAGG GATAGACCCT TTAATGACGG ACTATGTCGA A 691 



(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 215 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

His Leu Leu Glu Val Ser Tyr Glu Ala Leu Glu Asn Ala Gly Leu 

5 10 15 

Ser Leu Pro Cys Ile Ala Gly Thr Lys Met Gly Val Phe Val Gly

20 25 30

Gly Gly Asn Ala Xaa Tyr Arg Ser His Ile Gly Gln Asp Ile Asp

35 40 45 

Asn Leu Pro Met Phe Glu Ala Thr Gly Asn Ala Glu Ala Leu Leu 

50 55 60 

Ala Asn Arg Val Ser Tyr Val Tyr Asp Leu Arg Gly Pro Ser Leu 

65 70 75 

Thr Thr Asp Thr Ala Cys Ser Ser Ser Leu Ala Ala Leu Asn Thr 

80 85 90 

Ala Phe Leu Ser Leu Gln Ala Gly Glu Ser Ser Thr Ala Leu Val

95 100 105

Gly Ser Ser Val Ile Arg Leu Arg Pro Glu Ser Ala Ile Ser Leu

110 115 120

Ser Ser Met Gln Leu Leu Ser Pro Asp Gly Lys Ser Tyr Ala Phe

125 130 135 

Asp Glu Arg Ala Thr Ser Gly Phe Gly Arg Gly Glu Gly Ser Gly 

140 145 150 

Cys Ile Ile Leu Lys Pro Leu Asp Ala Ala Val Arg Asp Gly Asp

155 160 165

Pro Ile Arg Ala Val Ile Cys Asn Ser Gly Val Asn Gln Asp Gly

170 175 180

Lys Thr Ala Gly Ile Thr Met Pro Asn Gly His Ala Gln Ala Ser

185 190 195

Leu Ile Arg Ser Val Tyr Gln Ser Thr Gly Ile Asp Pro Leu Met

200 205 210 

Thr Asp Tyr Val Glu 

215

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 637

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 
GCTGTTTCTT CAAACTAGCT GGCAATGCAT TGAAGATGCG GGATATAACC 50 
CCACATCCTT TGCAGGTAGC AAGTGTGGCG TATTTGTCGG CTGCGAAACG 100
GGAGACTATG GAAAGATTGT GCAGCGATAT GAATTGAGCG CTCTCGGATT 150
GCTAGGCTCT TCTGCGGCAC TGCTCCCGGC AAGGATCTCC TATTTCCTCA 200
ACCTCCAGGG CCCTTGTATG GCGATCGACA CAGCCTGCTC TGCATCCCTA 250
GTTGCCATAG CCAACGCCTG CGACAGCCTG GTACTGGGTC ACTCCGATGC 300
AGCCTTGGCC GGAGGAGTCT ACGTCCTCTC CGGGCCGGAA ATGCACATTA 350
TGATGAGCAA AGCTGGTATC TTGTCACCCG ATGGCAGATG TTTCACCTTC 400
GATCGACGTG CTAACGGCTT TGTACCGGGC GAAGGTGTGG GCGTCGTGTT 450 
ACTCAAACGC CTTGCCGATG CCGAAAAAGA CGGTGATAAT ATCTGTGGTG 500 
TGATTCGAGG CTGGGGGGTG AATCAAGACG GCAAGACCAG TGGAATTACA 550 
GCACCTAACG GACAGTCACA GCAACGATTG CAGAAAGAAG TCTACGAACG 600 
GTTTCAGATT CAGCCAGCAG ACATTCAACT GGTTGAG 637



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
Leu Phe Leu Gln Thr Ser Trp Gln Cys Ile Glu Asp Ala Gly Tyr

5 10 15 

Asn Pro Thr Ser Phe Ala Gly Ser Lys Cys Gly Val Phe Val Gly 

20 25 30 

Cys Glu Thr Gly Asp Tyr Gly Lys Ile Val Gln Arg Tyr Glu Leu

35 40 45 

Ser Ala Leu Gly Leu Leu Gly Ser Ser Ala Ala Leu Leu Pro Ala 

50 55 60 

Arg Ile Ser Tyr Phe Leu Asn Leu Gln Gly Pro Cys Met Ala Ile

65 70 75

Asp Thr Ala Cys Ser Ala Ser Leu Val Ala Ile Ala Asn Ala Cys

80 85 90

Ile


Gln
Ile


Gln


Leu 
210 



Val Glu 



(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 643 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 
GATGATGATA GAAGTCGCTT ACCAAGGACT TGAGAGTGCA GGGCTGTCTC 50 
TTCAGGATGT TGCCGGATCG AGGACTGGAG TCTTCATTGG CCATTTCAGC 100 
AGTGATTACC GAGACATGAT ATTCAGAGAT CCCGAGAGGG CACCGACCTA 150 
CACTTTCAGT GGGGTTAGTA AGACGTCATT GGCGAATCGC ATCTCCTGGC 200
TGTTCGACCT GAAAGGCCCA AGTTTCAGCT TGGACACAGC CTGCTCGTCG 250
AGTCTGGTCG CCCTGCATTT GGCTTGCCAA AGCTTACGCG CTGGAGAGTC 300
AGATATCGCC ATTGTCGGAG GGGTCAACCT TCTCTGGAAT CCGGAGTTGT 350
TCATGTATCT CTCCAATCAG CACTTTCTCT CGCCAGATGG GAAATGTAAA 400 
AGCTTTGACG AATCCGGCGA TGGCTATGGT CGTGGCGAAG GCATTGCCGC 450 
TCTTGTACTA AGAAGAGTCG ACGACGCGAT TGCGGCCCGG GACCCTATTC 500
GTGCCATCAT TCGCGGTACT GGGAGTAATC AGGACGGACA CACCAAAGGC 550 
TTCACCCTCC CCAGCGCAGA AGCCCAGGCG AGGTTGATTA GAGATACGTA 600 
CTCTGCCGCG GGGCTAGGTT TTAGAGACAC GCGATACGTA GAA 643 

(2) INFORMATION FOR SEQ ID NO:44: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 

Met Met Ile Glu Val Ala Tyr Gln Gly Leu Glu Ser Ala Gly Leu

5 10 15

Ser Leu Gln Asp Val Ala Gly Ser Arg Thr Gly Val Phe Ile Gly

20 25 30

His Phe Ser Ser Asp Tyr Arg Asp Met Ile Phe Arg Asp Pro Glu

35 40 45

Arg Ala Pro Thr Tyr Thr Phe Ser Gly Val Ser Lys Thr Ser Leu

50 55 60

Ala Asn Arg Ile Ser Trp Leu Phe Asp Leu Lys Gly Pro Ser Phe

65 70 75

Ser Leu Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu

80 85 90

Ala Cys Gln Ser Leu Arg Ala Gly Glu Ser Asp Ile Ala Ile Val

95 100 105

Gly Gly Val Asn Leu Leu Trp Asn Pro Glu Leu Phe Met Tyr Leu

110 115 120

Ser Asn Gln His Phe Leu Ser Pro Asp Gly Lys Cys Lys Ser Phe

125 130 135

Asp Glu Ser Gly Asp Gly Tyr Gly Arg Gly Glu Gly Ile Ala Ala

140 145 150

Leu Val Leu Arg Arg Val Asp Asp Ala Ile Ala Ala Arg Asp Pro

155 160 165

Ile Arg Ala Ile Ile Arg Gly Thr Gly Ser Asn Gln Asp Gly His

170 175 180

Thr Lys Gly Phe Thr Leu Pro Ser Ala Glu Ala Gln Ala Arg Leu

185 190 195

Ile Arg Asp Thr Tyr Ser Ala Ala Gly Leu Gly Phe Arg Asp Thr

200 205 210 

Arg Tyr Val Glu 

(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single




(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 
RGTCCTTATG GAGACCGTCT ACGAGGCAAT TGAGTCTGCG GGTATGACTT 50
TGAAGGGGCT GCAAGGCAGC GACACAAGTG TGTATGCCGG CGTCATGTGT 100
GGCGACTACG AGGCCATACA GCTCCGCGAT CTGGACGCGG CCCCGACTTA 150
TTTCGCAGTG GGAACCTCGC GAGCTATCCT CTCCAATCGA ATCTCGTATT 200
TCTTCAACTG GCACGGCGCG TCCATCACCA TGGACACGGC ATGTTCCTCT 250
AGTCTGGTCG CCATTCACTT GGCCGTTCAG RCGCTTCGGG CAAATGAATC 300
ACGRATGGCC GTGGCGTGTG GGTCGAACCT CATTCTCGGA CCCGAGAGTT 350
ACATTATTGA AAGCAAGGTG AAGATGCTGT CCCCGGACGG TCTCAGCCGA 400
ATGTGGGATA AAGACGCCAA CGGCTATGCG CGTGGAGATG GCGTTGCGGC 450
CGTTGTTTTG AAGACTCTCA GCGCCGCGCT GGCGGACGGA GACCACATTG 500
AATGTCTCAT ACGGGAGACG GGACTCAACC AGGACGGTGC GACAGCCGGT 550
CTCACCATGC CTAGCGCCAC TGCGCAGCGA GCTCTTATTC ACAGTACGTA 600 
CACCAAGGCA GGTCTTGATC TCACTGCCCA GGCAGACCGT CCCCAGTATT 650 
TCGAG 655 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 

Val Leu Met Glu Thr Val Tyr Glu Ala Ile Glu Ser Ala Gly Met

5 10 15

Thr Leu Lys Gly Leu Gln Gly Ser Asp Thr Ser Val Tyr Ala Gly

20 25 30

Val Met Cys Gly Asp Tyr Glu Ala Ile Gln Leu Arg Asp Leu Asp

35 40 45

Ala Ala Pro Thr Tyr Phe Ala Val Gly Thr Ser Arg Ala Ile Leu

50 55 60

Ser Asn Arg Ile Ser Tyr Phe Phe Asn Trp His Gly Ala Ser Ile

65 70 75

Thr Met Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Ile His Leu

80 85 90

Ala Val Gln Xaa Leu Arg Ala Asn Glu Ser Arg Met Ala Val Ala

95 100 105

Cys Gly Ser Asn Leu Ile Leu Gly Pro Glu Ser Tyr Ile Ile Glu

110 115 120

Ser Lys Val Lys Met Leu Ser Pro Asp Gly Leu Ser Arg Met Trp

125 130 135

Asp Lys Asp Ala Asn Gly Tyr Ala Arg Gly Asp Gly Val Ala Ala

140 145 150

Val Val Leu Lys Thr Leu Ser Ala Ala Leu Ala Asp Gly Asp His

155 160 165

Ile Glu Cys Leu Ile Arg Glu Thr Gly Leu Asn Gln Asp Gly Ala

170 175 180

Thr Ala Gly Leu Thr Met Pro Ser Ala Thr Ala Gln Arg Ala Leu

185 190 195

Ile His Ser Thr Tyr Thr Lys Ala Gly Leu Asp Leu Thr Ala Gln

200 205 210

Ala Asp Arg Pro Gln Tyr Phe Glu

215 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 754 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
AGGTCTGTTG GAGACGGTTT ATCGCGCCTT TGAAAACGGT AAGGCCACCC 50 
TGGGAATAAA CCGGCTTCTC GTCCTGACGG CTTACTCTAT GCTAGCTGGT 100 
ATACCCATGG AGCAGGTCCT CGGGTCGAAG ACATCCGTTT ACGTGGGATG 150 
TTTCACCCGC GAGTTCGAGC AGTTGCTCGC GAGGGACCCC GAGATGAATC 200
TGAAATACAT CGCTACGGGC ACCGGCACGG CGATGCTGTC GAATCGCCTC 250
TCCTGGTTCT ATGACTTGAA AGGCGCCAGT ATCACTCTTG ATACTGCCTG 300
TTCGTCCAGT CTCAATGCGT GCCATCTTGC TTGCGCAAGC TTACGTAATG 350
GAGAAGCCAA TATGGTAAGA CTCCAACTCA TCGCGGGACT GAACAATTGC 400
ATACTGATCC ATCAAAGGCC CTGGTAGGAG GCTGCAATCT TTTCTATAAC 450 
CCGGAAACGA TCATCCCTCT GACAAATCTA GGCTTTCTTT CTCCGGATAA 500 
CAAATGTTAT AGTTTTGACC ATCGTGCTAA CGGTTACTCT CGCGGCGAGG 550 
GGTTTGGTAT TCTTGTATTG AAGAGACTGT CGGACGCTCT ACGCGATAAC 600 
GACACTGTCC GTGCAGTGAT TCGGGCCTCT TCGTCTAACC AGGATGGCAA 650 
GTCTCCCGGT ATCACACAGC CTACCAAACA AGCGCAAATA CAACTGATCA 700
AAGACACTTA CGCGGCTGCC GGGCTGGACT ATACGCAAAC CCGCTACTTC 750 
GANA 754 



(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Gly Leu Leu Glu Thr Val Tyr Arg Ala Phe Glu Asn Ala Gly Ile

5 10 15

Pro Met Glu Gln Val Leu Gly Ser Lys Thr Ser Val Tyr Val Gly

20 25 30

Cys Phe Thr Arg Glu Phe Glu Gln Leu Leu Ala Arg Asp Pro Glu

35 40 45

Met Asn Leu Lys Tyr Ile Ala Thr Gly Thr Gly Thr Ala Met Leu

50 55 60

Ser Asn Arg Leu Ser Trp Phe Tyr Asp Leu Lys Gly Ala Ser Ile

65 70 75

Thr Leu Asp Thr Ala Cys Ser Ser Ser Leu Asn Ala Cys His Leu

80 85 90

Ala Cys Ala Ser Leu Arg Asn Gly Glu Ala Asn Met Ala Leu Val

95 100 105

Gly Gly Cys Asn Leu Phe Tyr Asn Pro Glu Thr Ile Ile Pro Leu

110 115 120

Thr Asn Leu Gly Phe Leu Ser Pro Asp Asn Lys Cys Tyr Ser Phe

125 130 135

Asp His Arg Ala Asn Gly Tyr Ser Arg Gly Glu Gly Phe Gly Ile

140 145 150

Leu Val Leu Lys Arg Leu Ser Asp Ala Leu Arg Asp Asn Asp Thr

155 160 165

Val Arg Ala Val Ile Arg Ala Ser Ser Ser Asn Gln Asp Gly Lys

170 175 180

Ser Pro Gly Ile Thr Gln Pro Thr Lys Gln Ala Gln Ile Gln Leu

185 190 195

Ile Lys Asp Thr Tyr Ala Ala Ala Gly Leu Asp Tyr Thr Gln Thr

200 205 210 

Arg Tyr Phe Xaa 



(2) INFORMATION FOR SEQ ID NO: 49: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 722 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CTTGTTACTC GAGACTGTCT ACGAATCTCT CGAGTCGGCT GGTCAGACAA 50 
TCGAAGGCTT GCAAGGATCG CAAACCGCAG TGTATATTGG TGTAATGTGC 100 
GATGATTACG CCGAGCTCGT GTATCATGAT ACAGAGTCAA TCCCGACCTA 150 
TGCTGCAACT GGTAGTGCAC GCAGCATGAT GTCGAACCGA ATCTCTTACT 200 
TCTTTGACTG GAAGGGGCCG TCAATGACCA TTGATACTGC CTGTTCCTCT 250 
AGTCTTGTCG CTGTCCACCA GGCCGTTCAA GTTCTCAGGA GCGGAGAATC 300 
CCGCGTCGCA GTGGCTGCTG GGGCAAATCT CATCTTCGGA CCCAGTAAGT 350
CTTCCTAAAA TATGAGTAGG CTCCAGTCAT TGTGATTGCT AATCACTTCA 400 
ACCATTTACA GAGATGTACA TTGCTGAGAG CAACCTCAAT ATGTTGTCCC 450 
CAACTGGSCG STCCCGAATG TGGGACGCTA ACSCGGATGG CTATGCACGA 500 
GGAGAGGGTA TTGCATCTGT CGTACTCAAA ACTCTTAGCT CTGCTATAGC 550 
AGATGGTGAT ACCATCGAAT GTTTGATCCG AGAAACCGGT GTCAACCAGG 600
ATGGCCGCAC CACTGGTATC ACTATGCCAA GCTCCGCAGC CCAAGCCAGT 650 
TTGATCCGTC AGACTTACGC CAGAGCTGGT TTGGACCTGG CGAAGCAAGC 700
TGATCGGCCT CAATTCTTTG AG 722 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Leu Leu Leu Glu Thr Val Tyr Glu Ser Leu Glu Ser Ala Gly Gln

5 10 15

Thr Ile Glu Gly Leu Gln Gly Ser Gln Thr Ala Val Tyr Ile Gly

20 25 30

Val Met Cys Asp Asp Tyr Ala Glu Leu Val Tyr His Asp Thr Glu

35 40 45

Ser Ile Pro Thr Tyr Ala Ala Thr Gly Ser Ala Arg Ser Met Met

50 55 60

Ser Asn Arg Ile Ser Tyr Phe Phe Asp Trp Lys Gly Pro Ser Met

65 70 75

Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln

80 85 90

Ala Val Gln Val Leu Arg Ser Gly Glu Ser Arg Val Ala Val Ala

95 100 105

Ala Gly Ala Asn Leu Ile Phe Gly Pro Lys Met Tyr Ile Ala Glu

110 115 120

Ser Asn Leu Asn Met Leu Ser Pro Thr Gly Arg Ser Arg Met Trp

125 130 135

Asp Ala Asn Xaa Asp Gly Tyr Ala Arg Gly Glu Gly Ile Ala Ser

140 145 150

Val Val Leu Lys Thr Leu Ser Ser Ala Ile Ala Asp Gly Asp Thr

155 160 165

Ile Glu Cys Leu Ile Arg Glu Thr Gly Val Asn Gln Asp Gly Arg

170 175 180

Thr Thr Gly Ile Thr Met Pro Ser Ser Ala Ala Gln Ala Ser Leu

185 190 195

Ile Arg Gln Thr Tyr Ala Arg Ala Gly Leu Asp Leu Ala Lys Gln

200 205 210

Ala Asp Arg Pro Gln Phe Phe Glu

215 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 703 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 
AATATTACTT GAGACGATCT ACGAAGGACT TGAGTCCGCC GGACTTACCA 50 
TAAAGGGGCT GCAAGGTTCC CAAACAGCTG TGTACGTCGG TCTCATGGCT 100 
GGAGACTACT ATGACATCCA GATGCGCGAC ATAGAGACTT TGCCTCGATA 150 
TGCTGCTACC GGGACTGCTC GTAGCATTAT GAGCAACCGA GTCTCTTATT 200
TCTTTGATTG GAAAGGTCCG TCCATGACAA TTGATACGGC CTGCTCTTCT 250
TCCCTCGTTG CCGTTCATCA GGCTGTCGAG ATTCTCCGGA GAGGTGATGT 300
TACCATGGCT GTGGCTGCCG GCGCCAACCT GATCTATGGT CCTGAGGCTT 350
ATATATCCGA GTCGAATCTG AACATGCTGT CGCCGAGCGG AAGATCGCGC 400 
ATGTGGGATT CAAGTGCGGA CGGATACGGC CGCGGAGAAG GGTTTGCGGC 450 
AGTGATGTTG AAGACCCTGA GCGCTGCAAT TCGTGATGGA GATCATATCG 500 
AGTGCATTAT CCGGGAGACA GGAATTAACC AGGATGGCAG AACAGCCGGA 550 
ATTACCATGC CAAGTGCTGT CAGCCAGACT CGATTGATCA AAGACACATA 600
TGCTCGAGCT GGACTCGATT GCAGGAAAGA AGCGGAGAGA TGCCAGTACT 650 
TTGAAGGTAA GCGAATAACT TTTCTTGATA AACGCACTTA CTAAGATCTT 700 
TAA 703 



(2) INFORMATION FOR SEQ ID NO:52: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 234 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Ile Leu Leu Glu Thr Ile Tyr Glu Gly Leu Glu Ser Ala Gly Leu

5 10 15

Thr Ile Lys Gly Leu Gln Gly Ser Gln Thr Ala Val Tyr Val Gly

20 25 30

Leu Met Ala Gly Asp Tyr Tyr Asp Ile Gln Met Arg Asp Ile Glu

35 40 45

Thr Leu Pro Arg Tyr Ala Ala Thr Gly Thr Ala Arg Ser Ile Met

50 55 60

Ser Asn Arg Val Ser Tyr Phe Phe Asp Trp Lys Gly Pro Ser Met

65 70 75

Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln

80 85 90

Ala Val Glu Ile Leu Arg Arg Gly Asp Val Thr Met Ala Val Ala

95 100 105

Ala Gly Ala Asn Leu Ile Tyr Gly Pro Glu Ala Tyr Ile Ser Glu

110 115 120

Ser Asn Leu Asn Met Leu Ser Pro Ser Gly Arg Ser Arg Met Trp

125 130 135

Asp Ser Ser Ala Asp Gly Tyr Gly Arg Gly Glu Gly Phe Ala Ala

140 145 150

Val Met Leu Lys Thr Leu Ser Ala Ala Ile Arg Asp Gly Asp His

155 160 165

Ile Glu Cys Ile Ile Arg Glu Thr Gly Ile Asn Gln Asp Gly Arg

170 175 180

Thr Ala Gly Ile Thr Met Pro Ser Ala Val Ser Gln Thr Arg Leu

185 190 195

Ile Lys Asp Thr Tyr Ala Arg Ala Gly Leu Asp Cys Arg Lys Glu

200 205 210

Ala Glu Arg Cys Gln Tyr Phe Glu Gly Lys Arg Ile Thr Phe Leu

215 220 225 

Asp Lys Arg Thr Tyr Xaa Asp Leu Xaa 

230 

(2) INFORMATION FOR SEQ ID NO: 53: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 643

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
GCTGTTGCTG GAGGTAAGTT GGGAAGCTTT AGAAAATGCT GGCAAAGCAC 50
CTGAAAAGCT AGCAGGAAGC AATACAGGTG TATTTGTTGG CATTAGCAAC 100 
TTTGATTATT CACAGTTGCA AATTAATCAA ACCGCTCAAC TAGATGCCTA 150 
TACAGGCACT GGCAATGCTT TTAGCATCGC AGCTAACCGT CTTTCCTATT 200
TTCTAGACTT GCACGGACCT AGCTGGGCAG TAGACACAGC CTGTTCATCA 250
TCTCTAGTAG CAGTCCATCA AGCTTGCCAA AGTCTGCGTC AAGGAGAATG 300
CGAACTAGCC CTCGCTGGTG GTGTAAATCT GATTCTCACC CCACAATTAA 350
CCATCACTTT TTCCCAAGCT GGGATGATGG CTGCTGATGG TCGTTGCAAA 400
ACCTTTGATG CTGATGCTGA TGGTTACGTG CGGGGCGAAG GTTGTGGTGT 450
TGTAATTCTC AAGCGTTTGG CCAACGCTCA ACGAGATGGA GACAATATTT 500 
TGGCAGTTAT TAAAGGTTCG GCAGTTAACC AAGATGGTCG CAGCAACGGA 550 
TTGACAGCAC CCAACGGTCA TGCCCAACAA GCAGTTATTC GCCAAGCATT 600 
ACAAAATGCC AATGTTGCAG CTGCCGAGAT TAGCTATGTA GAA 643 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
Leu Leu Leu Glu Val Ser Trp Glu Ala Leu Glu Asn Ala Gly Lys 

5 10 15 

Ala Pro Glu Lys Leu Ala Gly Ser Asn Thr Gly Val Phe Val Gly 

20 25 30 

Ile Ser Asn Phe Asp Tyr Ser Gln Leu Gln Ile Asn Gln Thr Ala

35 40 45

Gln Leu Asp Ala Tyr Thr Gly Thr Gly Asn Ala Phe Ser Ile Ala

50 55 60

Ala Asn Arg Leu Ser Tyr Phe Leu Asp Leu His Gly Pro Ser Trp

65 70 75

Ala Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln

80 85 90

Ala Cys Gln Ser Leu Arg Gln Gly Glu Cys Glu Leu Ala Leu Ala

95 100 105

Gly Gly Val Asn Leu Ile Leu Thr Pro Gln Leu Thr Ile Thr Phe

110 115 120

Ser Gln Ala Gly Met Met Ala Ala Asp Gly Arg Cys Lys Thr Phe

125 130 135

Asp Ala Asp Ala Asp Gly Tyr Val Arg Gly Glu Gly Cys Gly Val

140 145 150

Val Ile Leu Lys Arg Leu Ala Asn Ala Gln Arg Asp Gly Asp Asn

155 160 165

Ile Leu Ala Val Ile Lys Gly Ser Ala Val Asn Gln Asp Gly Arg

170 175 180

Ser Asn Gly Leu Thr Ala Pro Asn Gly His Ala Gln Gln Ala Val

185 190 195

Ile Arg Gln Ala Leu Gln Asn Ala Asn Val Ala Ala Ala Glu Ile

200 205 210 

Ser Tyr Val Glu 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

TCTTTTTTTG GAGTGTGCTT GGGAAGCGCT GGAAAATGCT GGTTATGACC 50
CGAAAACAGA CAAAAATCTA ATTGGCGTTT ATGCAGGGGG GAATCTAAGT 100
ACCTACTTAC TTAACAATCT CGCCTCACAC CCTGAACTCA TTAAAGCGCT 150
GGAGTCACAA ATTACAATTG CTAATGATAA GGACTTTATA TGCACACGAG 200
TTTCTTACAA ATTAAACCTG AAAGGGCCGA GTATTAGTGT CGGCACGGCC 250
TGCTCTACGT CATTAGTAGC AGTTCACTTG GCATGTCGAG GATTGCTAAG 300
TTACCAGTGT GATATGGCAC TGGCTGGCGG TATTGCGATA CAAGTTCCAC 350
AAAAACAAGG TTATTTCTAT CAAGAAGGTG GCATGGCCTC TCCTGATGGC 400
CACTGTCGGG CCTTTGATGC TAAAGCACAA GGTAGCCCTT TTGGCAAAGG 450
AGCAGGTATT GTCGTGCTGA AAAGATTGGA AGATGCTGTA GCTGATGGAG 500
ACTGCATTTA TGCGGTTATC AAAGGTTCAG CCATCAATAA CGACGGTTCC 550
GAGAAGGTGA GTTACACCGC ACCCAGTGTA ACAGGCCAAG CAGAAGTGAT 600
TGCCGAGGCT CAGGCGATCG CTAACTTTGA TTCTGAAACA ATCACCTACA 650
TTGAA 655



(2) INFORMATION FOR SEQ ID NO: 56: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 217

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Leu Phe Leu Glu Cys Ala Trp Glu Ala Leu Glu Asn Ala Gly Tyr 

5 10 15 

Asp Pro Lys Thr Asp Lys Asn Leu Ile Gly Val Tyr Ala Gly Gly

20 25 30

Asn Leu Ser Thr Tyr Leu Leu Asn Asn Leu Ala Ser His Pro Glu

35 40 45

Leu Ile Lys Ala Leu Glu Ser Gln Ile Thr Ile Ala Asn Asp Lys

50 55 60

Asp Phe Ile Cys Thr Arg Val Ser Tyr Lys Leu Asn Leu Lys Gly

65 70 75

Pro Ser Ile Ser Val Gly Thr Ala Cys Ser Thr Ser Leu Val Ala

80 85 90

Val His Leu Ala Cys Arg Gly Leu Leu Ser Tyr Gln Cys Asp Met

95 100 105

Ala Leu Ala Gly Gly Ile Ala Ile Gln Val Pro Gln Lys Gln Gly

110 115 120

Tyr Phe Tyr Gln Glu Gly Gly Met Ala Ser Pro Asp Gly His Cys

125 130 135

Arg Ala Phe Asp Ala Lys Ala Gln Gly Ser Pro Phe Gly Lys Gly

140 145 150

Ala Gly Ile Val Val Leu Lys Arg Leu Glu Asp Ala Val Ala Asp

155 160 165

Gly Asp Cys Ile Tyr Ala Val Ile Lys Gly Ser Ala Ile Asn Asn

170 175 180

Asp Gly Ser Glu Lys Val Ser Tyr Thr Ala Pro Ser Val Thr Gly

185 190 195

Gln Ala Glu Val Ile Ala Glu Ala Gln Ala Ile Ala Asn Phe Asp

200 205 210

Ser Glu Thr Ile Thr Tyr Ile

215 

(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 765 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
ATTGCTGCTT GAAAACGTCT ATGAAGCTCT TGAAAACGGT GAGCGGTTCT 50 
TCAAGAGAAT ATTGATGCAT CAATATGCTA ACTTGATGTC AATCATCAGC 100 
TGGTATTCCT CTGAGCGAGT CCGTCTCTTC TAACACCTCC GTTTATGTTG 150 
GCTCATTCGG TGATGACTAT AAGACGATTC TCAATACCGA TTTTGAGAGT 200
TGGGTCAAGT ACAAAGGCAC CGGTGTCTAT AACTCGATTC TGGCCAATCG 250
AATCAGCTGG TTCTACGACT TTAAAGGAGC CAGCGTCACG CTAGATACCG 300
CATGCTCGAG TAGCTTGGTA GCCGTGCATA TGGCTTGCCA GGATTTGAGG 350
TTGGGAGAGT CTAGAATGGT CAGTGTATTT CTCTATTGAA AAGTACTAGA 400
GGATTCTAAT TGACGTATTT GGATACCAGT CCGTTGTCGG CGGTGTCAAC 450
ATCATTGGCC ATCCGTTGCT CGTCCACGAT CTAAGCAAGC TCGGAGCGCT 500
CTCTCCTGAT GGCGTGTGCT ACACTTTCGA TGAACGGGCC AATGGATATT 550
CCCGGGGAGA AGGTGTCGGC ACCATCGTTC TCAAACGGCT CTCTGACGCA 600
ATCGAAGATG GTGATACCAT TCGCGCTATC ATCCGTGCAA GCGGGTGCAA 650
TCAAGACGGT AAAACAGCAG GTATATTTGT CCCTTCAGTC CAAGCCCAGG 700
AGCGACTTAT CCGGGATACC TATGAGAAGG CTGGGCTTGA CCGGACACGC 750 
ACGACATATT TGGAA 765 



(2) INFORMATION FOR SEQ ID NO:58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 216 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: 
Leu Leu Leu Glu Asn Val Tyr Glu Ala Leu Glu Asn Ala Gly Ile

5 10 15

Pro Leu Ser Glu Ser Val Ser Ser Asn Thr Ser Val Tyr Val Gly

20 25 30

Ser Phe Gly Asp Asp Tyr Lys Thr Ile Leu Asn Thr Asp Phe Glu

35 40 45

Ser Trp Val Lys Tyr Lys Gly Thr Gly Val Tyr Asn Ser Ile Leu

50 55 60

Ala Asn Arg Ile Ser Trp Phe Tyr Asp Phe Lys Gly Ala Ser Val

65 70 75 

Thr Leu Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Met 

80 85 90 
Ala Cys Gln Asp Leu Arg Leu Gly Glu Ser Arg Met Val Ser Ser

95 100 105

Val Val Gly Gly Val Asn Ile Ile Gly His Pro Leu Leu Val His

110 115 120

Asp Leu Ser Lys Leu Gly Ala Leu Ser Pro Asp Gly Val Cys Tyr

125 130 135

Thr Phe Asp Glu Arg Ala Asn Gly Tyr Ser Arg Gly Glu Gly Val

140 145 150

Gly Thr Ile Val Leu Lys Arg Leu Ser Asp Ala Ile Glu Asp Gly

155 160 165

Asp Thr Ile Arg Ala Ile Ile Arg Ala Ser Gly Cys Asn Gln Asp

170 175 180

Gly Lys Thr Ala Gly Ile Phe Val Pro Ser Val Gln Ala Gln Glu

185 190 195

Arg Leu Ile Arg Asp Thr Tyr Glu Lys Ala Gly Leu Asp Arg Thr

200 205 210

Arg Thr Thr Tyr Leu Glu

215

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 709

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
TAAGTTACTG GAAACAGCAT ATACTGCGTT TGAGAACGGT GAGTACGCCT 50
TGCGTCGTAT CCCCTCCCCC CTCATGGAAG ATCTCAATCT GATCTCGTGA 100
AACAGCCGGC ATCGGGTTAG AAGCGGCACG AGGATCAAAC ACTTCAGTAC 150
ATATAGGTTG TTTTAATATC GACTATACAA GCAACCATAG TAGAGATCCA 200
GAGCAGATGC ACAAATATAC GGGGACTGGA GGAGCACCTT CCATGCTGTC 250
GAACAGACTG AGTTGGTTTT TCGATCTGAG AGGACCGAGC TTGACCTTGG 300
ACACGGCATG CTCTAGTAGC ATGGTTGCGC TTGATTTAGC ATGCCAGACT 350
TTGCAAAGTG GACAATCTGA CATGGGTCTT GTCGGGGGTT GTAATCTCAT 400
CTACAGCGTC GACATGACCA TGGCTCTATC CAAGCTTGGA TTTCTCTCCC 450
ATAACAGTCG GTGCTACAGT TTTGACCATC GAGCGGATGG GTACGCCAGA 500
GGTGAAGGCT TTGGAGTTTT AATTCTCAAA CGTGTCGAAG ACGCCATACG 550
AGATGGGGAT ACTATACGAG GAGTCATTCG ATTAACAAGC TCCAATCAAG 600
ACGGCCATAC TCCGGGAATA ACAATGCCCA GCAGAGACGC CCAAGCAAGT 650
TTGATTAGAA AGACATACCA ACAAGCTGGA TTAGATATGC AGATGACAGG 700
CTACTTTGA 709

(2) INFORMATION FOR SEQ ID NO:60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Lys Leu Leu Glu Thr Ala Tyr Thr Ala Phe Glu Asn Ala Gly Ile

5 10 15

Gly Leu Glu Ala Ala Arg Gly Ser Asn Thr Ser Val His Ile Gly

20 25 30

Cys Phe Asn Ile Asp Tyr Thr Ser Asn His Ser Arg Asp Pro Glu

35 40 45

Gln Met His Lys Tyr Thr Gly Thr Gly Gly Ala Pro Ser Met Leu

50 55 60

Ser Asn Arg Leu Ser Trp Phe Phe Asp Leu Arg Gly Pro Ser Leu

65 70 75

Thr Leu Asp Thr Ala Cys Ser Ser Ser Met Val Ala Leu Asp Leu

80 85 90

Ala Cys Gln Thr Leu Gln Ser Gly Gln Ser Asp Met Gly Leu Val

95 100 105

Gly Gly Cys Asn Leu Ile Tyr Ser Val Asp Met Thr Met Ala Leu

110 115 120

Ser Lys Leu Gly Phe Leu Ser His Asn Ser Arg Cys Tyr Ser Phe

125 130 135

Asp His Arg Ala Asp Gly Tyr Ala Arg Gly Glu Gly Phe Gly Val

140 145 150

Leu Ile Leu Lys Arg Val Glu Asp Ala Ile Arg Asp Gly Asp Thr

155 160 165

Ile Arg Gly Val Ile Arg Leu Thr Ser Ser Asn Gln Asp Gly His

170 175 180

Thr Pro Gly Ile Thr Met Pro Ser Arg Asp Ala Gln Ala Ser Leu

185 190 195

Ile Arg Lys Thr Tyr Gln Gln Ala Gly Leu Asp Met Gln Met Thr

200 205 210 

Gly Tyr Phe 

(2) INFORMATION FOR SEQ ID NO: 61: 
(i) SEQUENCE CHARACTERISTICS: 




(A) LENGTH: 649

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
AATGTTGCTC GAGATCACCT ACGAAGCCCT GGAGAACGCT GGACTTCCTT 50 
TGAGTAAGGT TGTCGGCTCT GATACAGCCT GCTTCATTGG TGGCTTTACA 100 
CGAGATTATG ATGATTTGAC CACTTCGGAG CTCGCGAAGA CCCTACTCTA 150 
CACAACTACC GGCAACGGCC TGACGATGAT GTCGAATCGC TTATCCTGGT 200 
TCTACGACCT TCATGGCCCG TCGGTTTCGC TCGACACAGC ATGTTCTAGC 250 
TCGCTGGTTG CACTAAACCT TGCATGCCAG ACAATCCGAG CATCGACGAA 300
TGACTCTCGA CAGGCGATAG TTGGAGGTGT CAATCTCATG CTGCTCCCTG 350 
ATCAGATGAC CACGATTAAT CCTCTGCATT TCTTAAGTCC TGATAGCCAA 400
TGCTACTCGT TTGATGACCG TGCAAACGGT TACACCCGTG GAGAAGGTAT 450 
TGGCATACTG GTGCTCAAGC ACATCAATGA TGCTATTCGA GATGGAGACT 500 
GTATAAGGGC AGTAATCCGC GGCACTGGGG TCAACTCCGA TGGCAAGACC 550 
CCTGGCATTA CCTTGCCAAG CACGGCTGCA CAAGCCTCTT TAATTCGCGC 600 
AACGTACGCC TCGGCAGGGC TGGACCCAGC TCACACCGGC TACTTTGAA 649 

(2) INFORMATION FOR SEQ ID NO:62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 216 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62: 

Met Leu Leu Glu Ile Thr Tyr Glu Ala Leu Glu Asn Ala Gly Leu

5 10 15

Pro Leu Ser Lys Val Val Gly Ser Asp Thr Ala Cys Phe Ile Gly

20 25 30 

Gly Phe Thr Arg Asp Tyr Asp Asp Leu Thr Thr Ser Glu Leu Ala 

35 40 45 

Lys Thr Leu Leu Tyr Thr Thr Thr Gly Asn Gly Leu Thr Met Met 

50 55 60 

Ser Asn Arg Leu Ser Trp Phe Tyr Asp Leu His Gly Pro Ser Val 

65 70 75 

Ser Leu Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu Asn Leu 

80 85 90 

Ala Cys Gln Thr Ile Arg Ala Ser Thr Asn Asp Ser Arg Gln Ala

95 100 105

Ile Val Gly Gly Val Asn Leu Met Leu Leu Pro Asp Gln Met Thr

110 115 120

Thr Ile Asn Pro Leu His Phe Leu Ser Pro Asp Ser Gln Cys Tyr

125 130 135

Ser Phe Asp Asp Arg Ala Asn Gly Tyr Thr Arg Gly Glu Gly Ile

140 145 150

Gly Ile Leu Val Leu Lys His Ile Asn Asp Ala Ile Arg Asp Gly

155 160 165

Asp Cys Ile Arg Ala Val Ile Arg Gly Thr Gly Val Asn Ser Asp

170 175 180

Gly Lys Thr Pro Gly Ile Thr Leu Pro Ser Thr Ala Ala Gln Ala

185 190 195

Ser Leu Ile Arg Ala Thr Tyr Ala Ser Ala Gly Leu Asp Pro Ala

200 205 210 

His Thr Gly Tyr Phe Glu 

215 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 747 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
TATGCTACTT GAATGCACAT ACGAAGCGTT AGAGAATGGT CAGTGAGCTA 50 
CGAGCCGATT TTCATATATC ATGGCTAACA AGTTGAAGCT GGCATACCTC 100 
TAGATAAAGT AGTAGGAGAA CCCGTAGGGG TGTACGTCGG CTCAGCTAGT 150 
TCCGATTACT CGGACATCGT GAACTCAGAC GGCGAGATGG TCTCCACTTA 200 
CACGGCCACG GGGTTGGCCG CAACGATGAT GGCAAACCGC ATATCCTATT 250 
TCTATGATCT CCGGGGGCCA AGCTTCACAT TGGACACGGC GTGTTCATCG 300 
AGTTTGATGG CGTTACACCT AGCGTGCCAA AGTCTTCGAG TCGGTGAATC 350
GAAGCAAGCC ATTGTGGGCG GGGTCCACCT TGTACTGAGC CCGGATTGTA 400
TGACTTCGAT GAGTTTATTA GGGTAAGACC TTCAAAATCT CCATGCAGAA 450 
TTTCTAAATC TAACCTACCA CCCTAGTTTG TTCTCTAATG ACGGCCGATC 500 
CTACACTTAT GACCATCGAG GTACTGGTTA TGGGCGCGGC GAAGGTATTG 550 
CTACCTTAGT AATAAAACCT CTTAAAGATG CGATGGAAGC CGGTGATAAC 600 
ATCCGGGCCA TCATCCGCAA TAGTGGGGCA AATCAAGATG GTCGAACACC 650 
AGGTGTGACT TTTCCAAGTC AAGATGCTCA GATAGATCTT ATGAGATCGG 700 
TATATCGTTC CGCTGGACTT GATGTACTTG ATACCGGCTA CGTGGAA 747



(2) INFORMATION FOR SEQ ID NO:64: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: 
Met Leu Leu Glu Cys Thr Tyr Glu Ala Leu Glu Asn Ala Gly Ile

5 10 15

Pro Leu Asp Lys Val Val Gly Glu Pro Val Gly Val Tyr Val Gly

20 25 30

Ser Ala Ser Ser Asp Tyr Ser Asp Ile Val Asn Ser Asp Gly Glu

35 40 45 

Met Val Ser Thr Tyr Thr Ala Thr Gly Leu Ala Ala Thr Met Met

50 55 60 

Ala Asn Arg Ile Ser Tyr Phe Tyr Asp Leu Arg Gly Pro Ser Phe

65 70 75

Thr Leu Asp Thr Ala Cys Ser Ser Ser Leu Met Ala Leu His Leu

80 85 90

Ala Cys Gln Ser Leu Arg Val Gly Glu Ser Lys Gln Ala Ile Val

95 100 105

Gly Gly Val His Leu Val Leu Ser Pro Asp Cys Met Thr Ser Met

110 115 120

Ser Leu Leu Gly Leu Phe Ser Asn Asp Gly Arg Ser Tyr Thr Tyr

125 130 135

Xaa His Arg Gly Thr Gly Tyr Gly Arg Gly Xaa Gly Ile Ala Thr

140 145 150

Leu Val Ile Lys Pro Leu Lys Asp Ala Met Glu Ala Gly Asp Asn

155 160 165

Ile Arg Ala Ile Ile Arg Asn Ser Gly Ala Asn Gln Asp Gly Arg

170 175 180

Thr Pro Gly Val Thr Phe Pro Ser Gln Asp Ala Gln Ile Asp Leu

185 190 195 

Met Arg Ser Val Tyr Arg Ser Ala Gly Leu Asp Val Leu Asp Thr 

200 205 210 

Gly Tyr Val Glu 

(2) INFORMATION FOR SEQ ID NO: 65: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 643 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

AATTCTACTT GAAGTCGCCT ATCAAGCAAT GGAGTCAAGC GGCTGCTTAC 50
GGAACCATCG ACGCGAAGCT GGGGATCCTG TGGGATGTTT TATTGGAGCT 100
AGCTTTGCCG AATATCTTGA CAACACCTGT TCTAATCCGC CAACCAGCTA 150
TACTTCCACT GGCACCATCA GAGCTTTCCA CTGCGGTAGA CTCAGTTATT 200
ACTTTGGATG GAGCGGTCCT GCCGAGGTCA TTGATACAGC TTGCTCCTCT 250
TCGTTGGTTG CTATCAATCG AGCTTGCAAG TCAGTGCAGG CGGGTGAATG 300
TACAATGGCT CTTACTGGTG GAGTGAACAT TATAACTGGT ATCCACAACT 350
TCTTAGATCT GGCAAAGGCT GGCTTYTTAA GCCCCACAGG CCAATGCAGA 400
CCCTTTGACC AGTCTGCAGA TGGGTATTGT CGCTCAGAAG GAGCAGGACT 450
TGTTGTACTA AAACTGTTAA GCCAAGCCAT AGCAGATGGA GATCAAATTT 500
TCGGAGTTAT TCCAAGTGTG TCCACCAACC AAGGCGGATT GTCATCTTCA 550
ATTACGATTC CTCATTCGCC TGCACAAAAA AAGTTGTATC AAACCGTGCT 600
TCGGCAAGCC GGCATGAAGC TAGAACAGGT TAGCTACGTA GAG 643



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: 
Ile Leu Leu Glu Val Ala Tyr Gln Ala Met Glu Ser Ser Gly Cys

5 10 15

Leu Arg Asn His Arg Arg Glu Ala Gly Asp Pro Val Gly Cys Phe

20 25 30

Ile Gly Ala Ser Phe Ala Glu Tyr Leu Asp Asn Thr Cys Ser Asn

35 40 45

Pro Pro Thr Ser Tyr Thr Ser Thr Gly Thr Ile Arg Ala Phe His

50 55 60

Cys Gly Arg Leu Ser Tyr Tyr Phe Gly Trp Ser Gly Pro Ala Glu

65 70 75

Val Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Ile Asn Arg

80 85 90

Ala Cys Lys Ser Val Gln Ala Gly Glu Cys Thr Met Ala Leu Thr

95 100 105

Gly Gly Val Asn Ile Ile Thr Gly Ile His Asn Phe Leu Asp Leu

110 115 120

Ala Lys Ala Gly Phe Leu Ser Pro Thr Gly Gln Cys Arg Pro Phe

125 130 135

Asp Gln Ser Ala Asp Gly Tyr Cys Arg Ser Glu Gly Ala Gly Leu

140 145 150

Val Val Leu Lys Leu Leu Ser Gln Ala Ile Ala Asp Gly Asp Gln

155 160 165

Ile Phe Gly Val Ile Pro Ser Val Ser Thr Asn Gln Gly Gly Leu

170 175 180

Ser Ser Ser Ile Thr Ile Pro His Ser Pro Ala Gln Lys Lys Leu

185 190 195

Tyr Gln Thr Val Leu Arg Gln Ala Gly Met Lys Leu Glu Gln Val

200 205 210 

Ser Tyr Val Glu 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 809 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
AGGAAACTAC TAGAGGTCGT GTTTGAATGT TTTGAGAGTG CCGGTACACC 50 
ACTTCACGCA GTTTCAGGAG CTAATATTGG CTGCTATGTT GGGAATTTTA 100 
CGTTGGATTA TCTTGTCATG CAGTCTAAGG ATACAGACTC TTTTCATCGA 150 
TATACTGCTC CAGGAATGGG ACCTACATTG TTAGCTAACC GCATAAGTCA 200
TGTTTTTAAT CTTCAAGGTC CAAGTGTTAT GCTTGATACA GCGTGTTCTT 250
CATCGATCTA CGCTCTTCAT GCAGCTTGTG TGGCCTTGAA TGCAGATGAG 300
TGCAATGCAG CAATTGTTGC TGGGGCAAAC CTAATCCAGT CACCTGAGTG 350
GCATCTTGCA GTCTCCAAAT CAGGTGTGAT TTCACAAACT TCCACGTGTC 400
ACACTTTCGA TGCTAGTGCG GATGGTTATG GGCGAGGCGA GGGCGTTGGG 450
GCCCTCTATC TCAAGCGTCT AAGTGACGCA ATCCGAGATC GAGATCCTAT 500 
ACGGTCTGTT ATTCGTGGTA CAGCTGTTAA TAGGTTAGTA CATCCTCTTA 550 
CCTTTCTTTC ATGGATTAGC GAGAATTAGG GTTCCAAATG TTTGAAAGCT 600 
CGGGTTCTAA TATTCATTCA CTGGACTAGT AATGGCAAGA CAAACGGCAT 650 
CAGTCAGCCT AGTGCTTTGG CACAGGAAGC TGTGATTAAA AAAGCTTATG 700 
CAAAGGCGGG ATTACCTGTT ACCGAGACTG ACTATGTTGA GGTAAGTGAG 750 
CTATGTTTAA ATCAGAAAAC GTCATGCCAT TATTTCTTAT CCTTCACTGA 800
NCTCTTACA 809 



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 237

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 
(iii) HYPOTHETICAL: no
(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
Arg Lys Leu Leu Glu Val Val Phe Glu Cys Phe Glu Ser Ala Gly 

5 10 15 

Thr Pro Leu His Ala Val Ser Gly Ala Asn Ile Gly Cys Tyr Val

20 25 30

Gly Asn Phe Thr Leu Asp Tyr Leu Val Met Gln Ser Lys Asp Thr

35 40 45

Asp Ser Phe His Arg Tyr Thr Ala Pro Gly Met Gly Pro Thr Leu

50 55 60

Leu Ala Asn Arg Ile Ser His Val Phe Asn Leu Gln Gly Pro Ser

65 70 75

Val Met Leu Asp Thr Ala Cys Ser Ser Ser Ile Tyr Ala Leu His

80 85 90

Ala Ala Cys Val Ala Leu Asn Ala Asp Glu Cys Asn Ala Ala Ile

95 100 105

Val Ala Gly Ala Asn Leu Ile Gln Ser Pro Glu Trp His Leu Ala

110 115 120

Val Ser Lys Ser Gly Val Ile Ser Gln Thr Ser Thr Cys His Thr

125 130 135

Phe Asp Ala Ser Ala Asp Gly Tyr Gly Arg Gly Glu Gly Val Gly

140 145 150

Ala Leu Tyr Leu Lys Arg Leu Ser Asp Ala Ile Arg Asp Arg Asp

155 160 165

Pro Ile Arg Ser Val Ile Arg Gly Thr Ala Val Asn Ser Asn Gly

170 175 180

Lys Thr Asn Gly Ile Ser Gln Pro Ser Ala Leu Ala Gln Glu Ala

185 190 195

Val Ile Lys Lys Ala Tyr Ala Lys Ala Gly Leu Pro Val Thr Glu

200 205 210

Thr Asp Tyr Val Glu Val Ser Glu Leu Cys Leu Asn Gln Lys Thr

215 220 225 

Ser Cys His Tyr Phe Leu Ser Phe Thr Xaa Leu Leu 

230 235 

(2) INFORMATION FOR SEQ ID NO: 69: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 658 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
TTTGCTCCTT GAGACTGTCT ACGAAGCTCT GGAAGCAGGC GGTCACACGA 50
TTGAAGCGCT ACGAGGATCT GATACGTCTG TCTTTACAGG CACCATGGGC 100
GTCGACTACA ACGATACTGT TATACGGGAC CTGAACGTCA TCCCGACGTA 150
CTTTGCTACT GGAGTAAATC GAGCTATCAT CTCGAACCGA GTCTCATACT 200
TCTTTGACTG GCATGGGCCG AGCATGACCA TCGACACAGC CTGTTCATCC 250
AGTCTCGTCG CCGTGCACCA AGGAGTGAAA GCTCTTCGGA GTGGGGAGTC 300
GCGTACTGCC CTGGCATGTG GGACGCAGGT CATTCTAAAT CCCGAGATGT 350
ATGTTATTGA GAGCAAGCTG AAAATGCTTT CTCCTACGGG CCGCTCCCGC 400
ATGTGGGATG CGGACGCGGA TGGCTACGCT CGTGGGGAGG GCGTAGCGGC 450
TGTAGTGCTG AAACGGCTCA GTGACGCTAT TGCGGATGGA SATCGCATCG 500
AGTGCATCAT CCGTGAGACA GGGTCCAACC AAGACGGCCA TTCAAATGGT 550
ATCACGGTGC CGAGTACGGA GGCCCAAGCG GCCCTCATCC ACCAAACCTA 600
TGCCAGAGCT GGTCTAGACC CGGAAAATAA CCCTCACGAC CGCCCTCAGT 650
TCTTCGAA 658



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
Leu Leu Leu Glu Thr Val Tyr Glu Ala Leu Glu Ala Gly Gly His

5 10 15

Thr Ile Glu Ala Leu Arg Gly Ser Asp Thr Ser Val Phe Thr Gly

20 25 30

Thr Met Gly Val Asp Tyr Asn Asp Thr Val Ile Arg Asp Leu Asn

35 40 45

Val Ile Pro Thr Tyr Phe Ala Thr Gly Val Asn Arg Ala Ile Ile

50 55 60

Ser Asn Arg Val Ser Tyr Phe Phe Asp Trp His Gly Pro Ser Met

65 70 75

Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Val His Gln

80 85 90

Gly Val Lys Ala Leu Arg Ser Gly Glu Ser Arg Thr Ala Leu Ala

95 100 105

Cys Gly Thr Gln Val Ile Leu Asn Pro Glu Met Tyr Val Ile Glu

110 115 120

Ser Lys Leu Lys Met Leu Ser Pro Thr Gly Arg Ser Arg Met Trp

125 130 135

Asp Ala Asp Ala Asp Gly Tyr Ala Arg Gly Glu Gly Val Ala Ala

140 145 150

Val Val Leu Lys Arg Leu Ser Asp Ala Ile Ala Asp Gly Xaa Arg

155 160 165

Ile Glu Cys Ile Ile Arg Glu Thr Gly Ser Asn Gln Asp Gly His

170 175 180

Ser Asn Gly Ile Thr Val Pro Ser Thr Glu Ala Gln Ala Ala Leu

185 190 195

Ile His Gln Thr Tyr Ala Arg Ala Gly Leu Asp Pro Glu Asn Asn

200 205 210

Pro His Asp Arg Pro Gln Phe Phe Glu

215

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 753

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

TGGGCTACTC GAGACTGCTT ACAAGGCGTT CGAAAACGGT GAGTCTTGAA 50
GCTGCACAGA TCAAGACAAG AACACTAAAT CTCTCAGCGG GCATACGCAT 100
AGAAGAAGCC GCTGGCTCTA GAACTTCAGT TCATATCGGG AGTTTCACTC 150
ATGATTGGAG AGACATCCTC CAAAGGGATC CACTAATGGA TGTTAGCTAC 200
ATAGCTACCG CAACCGAGGT TTCTATGCTA GCGAGTCGAC TCAGCTGGTT 250
TTATGATCTA AGTGGGCCYA GCATCTCCTT GGATACAGCG TGTTCGAGTA 300
GCTTAATGGC TTTACATCTC GCCTGCCAGA GTCTAAAGAG TCGAGAGGCC 350
GACATGGTAA GGCTATGCTA CTTTCTGGCT CACTCAAACT GTTTTCCATA 400
TCTGATGCTT GCACAGGGCC TTGTTGGGAG GGGCTAATCT TCTTTTGGAT 450
CCTGTAGGGG TTATTGGCAT AACAAATGTT GGCATGCTTT CGCCAGATGG 500
CATTAGTTAC AGCTTTGATC ATCGTGCAAA CGGGTATGCC CGAGGAGAAG 550
GGTTCGGAGT CGTTGTCATC AAACGCTTGG ACGATGCTCT CAGACATGGC 600
GATACTATTC GCGGTATCGT TCGTGCCACA GGATCGAATC AAGATGGAAG 650
AACTCCAGGG ATTACCCAAC CTGATGGAGC CGCGCAAGAA GAGCTCATCC 700
GAGACACTTA CAAAGCTGCT GGCTTAGATA TGAGGCTAGT AAGGTATTCT 750
TAA 753


(2) INFORMATION FOR SEQ ID NO: 72: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Gly Leu Leu Glu Thr Ala Tyr Lys Ala Phe Glu Asn Ala Gly Ile

5 10 15

Arg Ile Glu Glu Ala Ala Gly Ser Arg Thr Ser Val His Ile Gly

20 25 30

Ser Phe Thr His Asp Trp Arg Asp Ile Leu Gln Arg Asp Pro Leu

35 40 45

Met Asp Val Ser Tyr Ile Ala Thr Ala Thr Glu Val Ser Met Leu

50 55 60

Ala Ser Arg Leu Ser Trp Phe Tyr Asp Leu Ser Gly Pro Ser Ile

65 70 75

Ser Leu Asp Thr Ala Cys Ser Ser Ser Leu Met Ala Leu His Leu

80 85 90

Ala Cys Gln Ser Leu Lys Ser Arg Glu Ala Asp Met Gly Leu Val

95 100 105

Gly Gly Ala Asn Leu Leu Leu Asp Pro Val Gly Val Ile Gly Ile

110 115 120

Thr Asn Val Gly Met Leu Ser Pro Asp Gly Ile Ser Tyr Ser Phe

125 130 135

Asp His Arg Ala Asn Gly Tyr Ala Arg Gly Glu Gly Phe Gly Val

140 145 150

Val Val Ile Lys Arg Leu Asp Asp Ala Leu Arg His Gly Asp Thr

155 160 165

Ile Arg Gly Ile Val Arg Ala Thr Gly Ser Asn Gln Asp Gly Arg

170 175 180

Thr Pro Gly Ile Thr Gln Pro Asp Gly Ala Ala Gln Glu Glu Leu

185 190 195

Ile Arg Asp Thr Tyr Lys Ala Ala Gly Leu Asp Met Arg Leu Val

200 205 210

Arg Tyr Ser

(2) INFORMATION FOR SEQ ID NO: 73: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 753 

(B) TYPE: nucleic acid 




(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL : no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

ATTGTTGCTC GAAGTAACCT ATGAAGCTTT AGAGAACGGT GGGTAGTTCC 50 

AGGAAGCATT AATCAAGACA AAGCTATTGC TCACACTTTT CCAAAATAGC 100 

CGGAATACCC TTGAACCAAA TTGTGGGCCA GGATGTTGGG GTTTTTGTTG 150 

GCGGCTCAAT GTCCGACTAC CAGAACCTCC TCCACAAAGA CATCGCAAAT 200

GGTCCTATTT ACCAAGCCAC TGGCACTGCC ATGAGCTTCC TAGCCAACCG 250 

AATATCTTAC ATCTATGACC TCAAGGGCCC AAGCGTAACA GTGGACACTG 300

CATGCTCCTC GGGTCTCACG GCACTTCATT TAGCATGCCA GAGCATACGC 350

ACTGGTGAGA TCCGACAAGC TTTGGTCGGC GGTGTATACA TTATCCTAAG 400

CCCGGAGAAT ATGATTGCCA TGAGCATGCT GGGGTGATGT CTCCTGTTCC 450

AGAAAGTAAT TGATAAAAGC TAATGCCAGT AGACTGTTTG GCACCGACGG 500 

TCTCTCATAC AGCTATGATC ACCGAGCAAC TGGATATGGA CGTGGTGAAG 550 

GAGGAGGCAT GATAGTCTTA AAGTCGGTAG ACGACGCGAT GGCAAACGGA 600 

GACACAATAC ATGCGGTAAT TCGGCACACA GGGACAAATC AGGATGGTAA 650 

GACCAGCGGC CCAACAATGC CCAGTCTGGA AGCCCAGGAG AGACTCATCA 700 

AGAAAGTTTA CAGCCAGGCT GGTCTGGATC CATTGGATAC AGAATATGTC 750 

GAG 753 

(2) INFORMATION FOR SEQ ID NO:74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74: 

Leu Leu Leu Glu Val Thr Tyr Glu Ala Leu Glu Asn Ala Gly Ile

5 10 15

Pro Leu Asn Gln Ile Val Gly Gln Asp Val Gly Val Phe Val Gly

20 25 30

Gly Ser Met Ser Asp Tyr Gln Asn Leu Leu His Lys Asp Ile Ala

35 40 45

Asn Gly Pro Ile Tyr Gln Ala Thr Gly Thr Ala Met Ser Phe Leu

50 55 60

Ala Asn Arg Ile Ser Tyr Ile Tyr Asp Leu Lys Gly Pro Ser Val

65 70 75

Thr Val Asp Thr Ala Cys Ser Ser Gly Leu Thr Ala Leu His Leu

80 85 90

Ala Cys Gln Ser Ile Arg Thr Gly Glu Ile Arg Gln Ala Leu Val

95 100 105

Gly Gly Val Tyr Ile Ile Leu Ser Pro Glu Asn Met Ile Ala Met

110 115 120

Ser Met Leu Gly Leu Phe Gly Thr Asp Gly Leu Ser Tyr Ser Tyr

125 130 135

Asp His Arg Ala Thr Gly Tyr Gly Arg Gly Glu Gly Gly Gly Met

140 145 150

Ile Val Leu Lys Ser Leu Asp Asp Ala Met Ala Asn Gly Asp Thr

155 160 165

Ile His Ala Val Ile Arg His Thr Gly Thr Asn Gln Asp Gly Lys

170 175 180

Thr Ser Gly Pro Thr Met Pro Ser Leu Glu Ala Gln Glu Arg Leu

185 190 195

Ile Lys Lys Val Tyr Ser Gln Ala Gly Leu Asp Pro Leu Asp Thr

200 205 210

Glu Tyr Val Glu



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 692 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: 
AATGCTGCTT GAGGTAGTCT ATGAGGCGTT AGAAGACGGT AAGTCTAACG 50
AATTTCAATC AGTGGTCCTG AGCTAATTGC GATCAAGCTG GCATTACGCT 100
CGACGACATT AAGGGTTCCC AGACATCTGT CTACTGTGGG AGCTTCACCA 150
ACGACTACCG TGAAATGCTG AACAAAGATT TGGGGTACTA CCCCAAGTAC 200
ATGGCCACTG GTGTTGGAAA CTCCATCTTA GCCAACCGCA TTTCATATTT 250
CTATGACCTA CACGGACCAA GTGTGACTGT CGACACAGCC TGCTCTCTTC 300
CCCTGGTCTC ATTCCATATG GGCAACAGAT CAATCCMAGA TGGAGATGCT 350
GACATCTCAA TCGTCATTGG ATCTTCGCTC CATTTTGATC CCAACATGTT 400
CGTCACTATG ACGGACCTTG GGTTTCTCTC AACCGACGGC AGATGCCGTG 450
CTTTTGACGC TAGCGGAAAG GGGTATGTCC GCGGTGAGGG CATCTGCGCT 500
GTTGTTTTGA AACAAAAATC ACGCGCTGAA CTTCACGACA ACAACGTTCG 550
ATCCGTCATT CGTGGCTCGG ATGTCAACCA CGACGGTGCC AAAGACGGTA 600
TCACAATGCC AAACTCGAAG GCTCAGGAGA GCCTCATCAG AAAGACCTAC 650
AAAAACGCTG GACTGAGTAC AAACGACACC CAGTACTTTG AG 692



(2) INFORMATION FOR SEQ ID NO: 76: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE:

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Met Leu Leu Glu Val Val Tyr Glu Ala Leu Glu Asp Ala Gly Ile

5 10 15

Thr Leu Asp Asp Ile Lys Gly Ser Gln Thr Ser Val Tyr Cys Gly

20 25 30

Ser Phe Thr Asn Asp Tyr Arg Glu Met Leu Asn Lys Asp Leu Gly

35 40 45

Tyr Tyr Pro Lys Tyr Met Ala Thr Gly Val Gly Asn Ser Ile Leu

50 55 60

Ala Asn Arg Ile Ser Tyr Phe Tyr Asp Leu His Gly Pro Ser Val

65 70 75

Thr Val Asp Thr Ala Cys Ser Leu Pro Leu Val Ser Phe His Met

80 85 90

Gly Asn Arg Ser Ile Xaa Asp Gly Asp Ala Asp Ile Ser Ile Val

95 100 105

Ile Gly Ser Ser Leu His Phe Asp Pro Asn Met Phe Val Thr Met

110 115 120

Thr Asp Leu Gly Phe Leu Ser Thr Asp Gly Arg Cys Arg Ala Phe

125 130 135

Asp Ala Ser Gly Lys Gly Tyr Val Arg Gly Glu Gly Ile Cys Ala

140 145 150

Val Val Leu Lys Gln Lys Ser Arg Ala Glu Leu His Asp Asn Asn

155 160 165

Val Arg Ser Val Ile Arg Gly Ser Asp Val Asn His Asp Gly Ala

170 175 180

Lys Asp Gly Ile Thr Met Pro Asn Ser Lys Ala Gln Glu Ser Leu

185 190 195

Ile Arg Lys Thr Tyr Lys Asn Ala Gly Leu Ser Thr Asn Asp Thr

200 205 210

Gln Tyr Phe Glu

(2) INFORMATION FOR SEQ ID NO: 77: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 690 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
TATTTTATTG GAGACAACAT ACGAAGCACT TGAAAATAGT GAGTAAGCCA 50
TGACCGTATT AAGTAAAAGC TCACGAACAG TAAAGGTGGC ACCCCTCTGG 100
CTAGCATTCG CGGCCAAAAT GTAGGCGTTT ACGTTGGTGC ATCCATGTCA 150
GACTACAACG AGCTTTTCGC AAAGGACCCG GATACCAATT TGACATATCG 200
TATTACCGGA ACTGCATCAA ATATTTTGTC AAATCGACTC TCCTACATGT 250
TCGACCTTCA CGGGCCAAGT TTCACGGTGG ACACTGCGTG CTCATCAAGC 300
TTGGCCGCAT TCCATCTGGC CTGTCAGAGT TTGAAGACGG GAGAGGTCCG 350
GCAAGCCATC GTGGGCGGGG CTTACCTTGT ATTATCCCCA GATCCTACGA 400
TCGGAATGAG CAAACTCAGG CTTTACGGCG AACATGGTCG CTCATACACT 450
TACGATCACC GAGGGACTGG ATACGGTCGT GGCGAGGGCG TCGCTAGCCT 500
AATTCTTAAG CCTTTACAAG ATGCTATCGA CGTGGGTGAT ACAATTCGAG 550
CAATCATACG TAACACTGGA ATGAATCAAG ACGGGAAGAC GAACGGAATT 600
ACGCTCCCAA GCAAAGACGC CCAAGAAAGC CTCATAAGGT CTGTCTACAC 650
AGCTGCAGGT CTCGATCCAC TGTATACTTC CTACGTTGAG 690

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Ile Leu Leu Glu Thr Thr Tyr Glu Ala Leu Glu Asn Ser Gly Thr

5 10 15

Pro Leu Ala Ser Ile Arg Gly Gln Asn Val Gly Val Tyr Val Gly

20 25 30

Ala Ser Met Ser Asp Tyr Asn Glu Leu Phe Ala Lys Asp Pro Asp

35 40 45

Thr Asn Leu Thr Tyr Arg Ile Thr Gly Thr Ala Ser Asn Ile Leu

50 55 60

Ser Asn Arg Leu Ser Tyr Met Phe Asp Leu His Gly Pro Ser Phe

65 70 75

Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Ala Ala Phe His Leu

80 85 90

Ala Cys Gln Ser Leu Lys Thr Gly Glu Val Arg Gln Ala Ile Val

95 100 105

Gly Gly Ala Tyr Leu Val Leu Ser Pro Asp Pro Thr Ile Gly Met

110 115 120

Ser Lys Leu Arg Leu Tyr Gly Glu His Gly Arg Ser Tyr Thr Tyr

125 130 135

Asp His Arg Gly Thr Gly Tyr Gly Arg Gly Glu Gly Val Ala Ser

140 145 150

Leu Ile Leu Lys Pro Leu Gln Asp Ala Ile Asp Val Gly Asp Thr

155 160 165

Ile Arg Ala Ile Ile Arg Asn Thr Gly Met Asn Gln Asp Gly Lys

170 175 180

Thr Asn Gly Ile Thr Leu Pro Ser Lys Asp Ala Gln Glu Ser Leu

185 190 195

Ile Arg Ser Val Tyr Thr Ala Ala Gly Leu Asp Pro Leu Tyr Thr

200 205 210

Ser Tyr Val Glu



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 761

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
GCGAATGCTA GAGACGGCTT ATCACGCTCT GGAGGACGGT AAGTCTAACC 50 
AGTGCAAATT TAGGGGCTAT AATCTTGGTG TGTGAGAATA ACATACCATC 100 
AGCGAGCATC CCCCTGGAGA AGTGCTTCGG CTCAGACACT TCCGTTTATA 150 
CCGGGTGCTT CACCAACGAT TATCTCAGCA TACTGCAGCA AGACTTTGAG 200
GCTGAGCAAA GGCACGCAGC CATGGGAATC GCGCCCTCCA TGTTGGCCAA 250 
TCGCCTAAGC TGGTTCTTCA ACTTCAAGGG GACATCGATG AACCTGGATT 300 
CGGCCTGCTC CAGCAGTCTG GTTGCACTGC ATCTTGCTTC ACAGGACCTC 350
CGTGCTGGTA CCACATCGAT GGTATGTATC GATCATAAAA TCACGTACTC 400 
CTTCATTAAT AAATAAATGT TTTAGGCACT AGTTGGAGGG GCGAATCTTG 450 
TCTACCACCC CGACTTCATG GAGATGATGT CAAACTTCAA CTTCCTGTCT 500 
CCCGACAGCC GTTCTTGGAG TTTCGATCAA CGTGCTAATG GTTATGCGCG 550 
TGGGGAAGGA ACCGCCGTGA TGGTCGTCAA ACGCCTTGCA GATGCACTGC 600 
GAGATGGAGA TACAATCAGA ACCGTAATCT GGAGTACCGG GTCGAACCAA 650 
GACGGGAGAA CACCTGGGAT CACGCAGCCA AGTAAAGAAG CGCAGTTAAA 700
TCTCATCGAG CGCACCTACA AACAAGCGAA GATTGATATG GAGCCTACCA 750 
GATTCTTCGA G 761 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein

(iii) HYPOTHETICAL: no
(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Arg Met Leu Glu Thr Ala Tyr His Ala Leu Glu Asp Ala Ser Ile

5 10 15

Pro Leu Glu Lys Cys Phe Gly Ser Asp Thr Ser Val Tyr Thr Gly

20 25 30

Cys Phe Thr Asn Asp Tyr Leu Ser Ile Leu Gln Gln Asp Phe Glu

35 40 45

Ala Glu Gln Arg His Ala Ala Met Gly Ile Ala Pro Ser Met Leu

50 55 60

Ala Asn Arg Leu Ser Trp Phe Phe Asn Phe Lys Gly Thr Ser Met

65 70 75

Asn Leu Asp Ser Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu

80 85 90

Ala Ser Gln Asp Leu Arg Ala Gly Thr Thr Ser Met Ala Leu Val

95 100 105

Gly Gly Ala Asn Leu Val Tyr His Pro Asp Phe Met Glu Met Met

110 115 120

Ser Asn Phe Asn Phe Leu Ser Pro Asp Ser Arg Ser Trp Ser Phe

125 130 135

Asp Gln Arg Ala Asn Gly Tyr Ala Arg Gly Glu Gly Thr Ala Val

140 145 150

Met Val Val Lys Arg Leu Ala Asp Ala Leu Arg Asp Gly Asp Thr

155 160 165

Ile Arg Thr Val Ile Trp Ser Thr Gly Ser Asn Gln Asp Gly Arg

170 175 180

Thr Pro Gly Ile Thr Gln Pro Ser Lys Glu Ala Gln Leu Asn Leu

185 190 195

Ile Glu Arg Thr Tyr Lys Gln Ala Lys Ile Asp Met Glu Pro Thr

200 205 210

Arg Phe Phe Glu

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1221 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
AAGGAGGGGC CGCCCGGGAG AAGAAGTTAT CGTGGGCGCC GATTCGGTCG 50 
ACCGGCAGCA ATTGCAGCCA GATTGCCGCG AGGGCTTCCT CCATTCCCGG 100
CGCGGGCGCA ACGAATCCGG TGTACTCCAG ATGCCGTGCG GTCCGGGGGA 150
GAGCTGCCTG ATCCAGTTTG AGATTCTTGT TTAAAGGAAG TTCGGCCAGC 200
TTCTCTATGG CGGCGGGGAC CATGTGAGCG GGGAGCAGAG CCTTCATGTG 250
CTGGCGAATC GTTTCCGTGG ACGCTCCGCC GACTGCATAC GCCGCGAGAT 300
ACTTCTCGCC GGGGATATCG TCTCGGACCA GCACAACGCC GTCCGTGACG 350
CCCGGGCACG ACTGCAGCGC GGCCTGAATT TCGCCGAGTT CTATGCGATG 400
CCCGCGAAGC TTGATCTGGC CGTCGTTTCT GCCCAGAAAA TCGATGCGCC 450
CATCCGGCAG ATAGCGCGCG CGATCGCCCG TGCGGTACAT ACGCGCGCCC 500 
GGAAATGGGC TAAACGGGTT CGGCACAAAG TAGGCTGCGG TGAGATCGCT 550 
GCGCCCCGCA TAGCCGCGCG CGACACCGTC TCCGGCAGCG TACAGCCAGC 600 
CTTCCACTCC CGGCGGAACG GGAGCGAATT GCTCGTCGAG CACGTAGGTT 650 
TGGACGTTCG AAATTGGACG GCCGATGGGA ATCGACGGGG TCCCGGCGGG 700 
GACCGAATCG ATGACGCCAC ACGCCGTGAG CATCGTGTTC TCGGTAGGGC 750 
CGTAACCGTT CAAGAGGCGG GCGGGCTTGC CGTGCTCGAT CACCATGCGC 800 
ATCCAGTGGG GATCCAGCGC TTCGCCGCCG ACAATCACAT TGGTCAGCGA 850 
TTCGAATCCG GCTGGATCTT CGCGGGCAAC CTGATTGAAC AGAGATGCAG 900 
TAAGGATAAT CGTGTCCACG TGGAAGCGGC GAAAGGCGAG AATCAGCTCG 1000
CGGGGCGCCA TCAAGGTCTC TTTCGAAAGA ACGACGATTC GCGCGCCATG 1050 
CAGCAGGCCG CCCCATAACT CGAAGGTGGG AGGGTCGAAA CCGAAGGCCG 1100 
ACATCTGTCC CACGGTATCG GCGGGTGAGA ATTGTACGTA GTTGGTCCGG 1150 
CTAACGAGGT TGACAATCGC CCCGTGGGGG ACGGCGACCC CCTTGGGCTT 1200
GCCGGTCGTG CCGGACGTGT A 1221 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 
Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Ala Val Pro

5 10 15

His Gly Ala Ile Val Asn Leu Val Ser Arg Thr Asn Tyr Val Gln

20 25 30

Phe Ser Pro Ala Asp Thr Val Gly Gln Met Ser Ala Phe Gly Phe

35 40 45

Asp Pro Pro Thr Phe Glu Leu Trp Gly Gly Leu Leu His Gly Ala

50 55 60

Arg Ile Val Val Leu Ser Lys Glu Thr Leu Met Ala Pro Arg Glu

65 70 75

Leu Ile Leu Ala Phe Arg Arg Phe His Val Asp Thr Ile Ile Leu

80 85 90

Thr Ala Ser Leu Phe Asn Gln Val Ala Arg Glu Asp Pro Ala Gly

95 100 105

Phe Glu Ser Leu Thr Asn Val Ile Val Gly Gly Glu Ala Leu Asp

110 115 120

Pro His Trp Met Arg Met Val Ile Glu His Gly Lys Pro Ala Arg

125 130 135

Leu Leu Asn Gly Tyr Gly Pro Thr Glu Asn Thr Met Leu Thr Ala

140 145 150

Cys Gly Val Ile Asp Ser Val Pro Ala Gly Thr Pro Ser Ile Pro

155 160 165

Ile Gly Arg Pro Ile Ser Asn Val Gln Thr Tyr Val Leu Asp Glu

170 175 180

Gln Phe Ala Pro Val Pro Pro Gly Val Glu Gly Trp Leu Tyr Ala

185 190 195

Ala Gly Asp Gly Val Ala Arg Gly Tyr Ala Gly Arg Ser Asp Leu

200 205 210

Thr Ala Ala Tyr Phe Val Pro Asn Pro Phe Ser Pro Phe Pro Gly

215 220 225

Ala Arg Met Tyr Arg Thr Gly Asp Arg Ala Arg Tyr Leu Pro Asp

230 235 240

Gly Arg Ile Asp Phe Leu Gly Arg Asn Asp Gly Gln Ile Lys Leu

245 250 255

Arg Gly His Arg Ile Glu Leu Gly Glu Ile Gln Ala Ala Leu Gln

260 265 270

Ser Cys Pro Gly Val Thr Asp Gly Val Val Leu Val Arg Asp Asp

275 280 285

Ile Pro Gly Glu Lys Tyr Leu Ala Ala Tyr Ala Val Gly Gly Ala

290 295 300

Ser Thr Glu Thr Ile Arg Gln His Met Lys Ala Leu Leu Pro Ala

305 310 315

His Met Val Pro Ala Ala Ile Glu Lys Leu Ala Glu Leu Pro Leu

320 325 330

Asn Lys Asn Leu Lys Leu Asp Gln Ala Ala Leu Pro Arg Thr Ala

335 340 345

Arg His Leu Glu Tyr Thr Gly Phe Val Ala Pro Ala Pro Gly Met

350 355 360

Glu Glu Ala Leu Ala Ala Ile Trp Leu Gln Leu Leu Pro Val Asp

365 370 375

Arg Ile Gly Ala His Asp Asn Phe Phe Ser Arg Ala Ala Pro Pro

380 385 390



(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1222 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
CGTTTCACCC CAAGAATCTC AGACCATATA TCAGCAATGG CCTTCTCCCT 50 
GGCATTGCCC GGAGCGACAT AGATCGGATC CCGAATCACA GTATCGCGAT 100 
CAAATGGCGG CAGGGCGTTT CGGTCAATCT TGCCGTTCGG CGTTAAAGGG 150 
AGAGAATCGA CAATGACGAA GGCGCTGGGC ACCATGTAGT CCGGCAGTTT 200 
TGCCTTCAGA TGGGCGCGCA ATTCGCTTAT TTCGGGAGCA CCTTCCCGTG 250
CGACGATATA AGCAACTAAT TGCTTTTCTT CGCTAGGGTC TTTTGTCGTT 300
GTGACCACAG CTTCTCGAAT CGGGGATGTT GCGCAACAGG ACTTCGATTT 350
CTCCAGCTCG ATGCGATAGC CGCGAATCTT GACCTGATTG TCGGTGCGGC 400 
CGATAAACTC GATGTTGCCA TCCGGCAAAT AACGCGCAAG ATCGCCAGTT 450 
CGATAGAGGC GCTGCGCTGG CTCGCGATCG AATGAATGGT AGATGAACCT 500 
CTCCGCCGTC AGTTCCGGCC GGTTGAGATA CCCTCGCGCC AGTCCGTCGC 550 
CGCCAATGTA GATCTCTCCA ACCACGCCGA TCGGCACCGG ATTGAGATGA 600 
GCATCCAGTA TGTAGATCTG CGTATTCGCG ATCGGTCGGC CAATGGGCGG 650 
TAATTCTCCC CAGCACTCTG GCGGACCGTC CACAGTAAAC GCTGTCACAA 700 
CGTGGCTTTC CGTCGGCCCA TACTGGTTGA CCAAATGACA CTCGGGCAAC 750 
GTGTCAAGGA AACTTCTGAT CCGCGGCGTT ATCTGCAGCC GCTCTCCCGC 800 
CGTAATGACT TCGCGCAGCT GCGGCAAAAC CACATTCTCC ATGTGCGCGG 850 
CTTCCGCCAT CTGTTGCAGT ACGACAAAAG GCACAAAAAG TCTCTCTACT 900 
CGCTTCATTC GCAGGAAATT CAACAGGGCT GGCGGATCGC GTCGGATTTG 950 
CGCGGGCAGT AGCACCAGTG TGCCTCCTGA GCACCACGTG CTAAACATCT 1000 
CTTGAAACGA AACATCGAAA CTCAACGAGG CAAACTGTAA CGTTCGCGCC 1050 
GGCACCGAAC GAGAAAAATC CTCAATTTGC CACGCGATCA GGTTGGCAAG 1100 
CGCGCGGTGT TCCATCACCA CACCCTTCGG CTTGCCCGTC GTGCCAATCC 1150 
CGCGGCCATG GCGGCCGGGA GCATGCGACG TCGGGCCCAA TTCGCCCTAT 1200 
AGTGAGTCGT ATTACAATTC AA 1222 



(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 396

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
Gly Thr Thr Gly Lys Pro Lys Gly Val Val Met Glu His Arg Ala

5 10 15

Leu Ala Asn Leu Ile Ala Trp Gln Ile Glu Asp Phe Ser Arg Ser

20 25 30

Val Pro Ala Arg Thr Leu Gln Phe Ala Ser Leu Ser Phe Asp Val

35 40 45

Ser Phe Gln Glu Met Phe Ser Thr Trp Cys Ser Gly Gly Thr Leu

50 55 60

Val Leu Leu Pro Ala Gln Ile Arg Arg Asp Pro Pro Ala Leu Leu

65 70 75

Asn Phe Leu Arg Met Lys Arg Val Glu Arg Leu Phe Val Pro Phe

80 85 90

Val Val Leu Gln Gln Met Ala Glu Ala Ala His Met Glu Asn Val

95 100 105

Val Leu Pro Gln Leu Arg Glu Val Ile Thr Ala Gly Glu Arg Leu

110 115 120

Gln Ile Thr Pro Arg Ile Arg Ser Phe Leu Asp Thr Leu Pro Glu

125 130 135

Cys His Leu Val Asn Gln Tyr Gly Pro Thr Glu Ser His Val Val

140 145 150

Thr Ala Phe Thr Val Asp Gly Pro Pro Glu Cys Trp Gly Glu Leu

155 160 165

Pro Pro Ile Gly Arg Pro Ile Ala Asn Thr Gln Ile Tyr Ile Leu

170 175 180

Asp Ala His Leu Asn Pro Val Pro Ile Gly Val Val Gly Glu Ile

185 190 195

Tyr Ile Gly Gly Asp Gly Leu Ala Arg Gly Tyr Leu Asn Arg Pro

200 205 210

Glu Leu Thr Ala Glu Arg Phe Ile Tyr His Ser Phe Asp Arg Glu

215 220 225

Pro Ala Gln Arg Leu Tyr Arg Thr Gly Asp Leu Ala Arg Tyr Leu

230 235 240

Pro Asp Gly Asn Ile Glu Phe Ile Gly Arg Thr Asp Asn Gln Val

245 250 255

Lys Ile Arg Gly Tyr Arg Ile Glu Leu Glu Lys Ser Lys Ser Cys

260 265 270

Cys Ala Thr Ser Pro Ile Arg Glu Ala Val Val Thr Thr Thr Lys

275 280 285

Asp Pro Ser Glu Glu Lys Gln Leu Val Ala Tyr Ile Val Ala Arg

290 295 300

Glu Gly Ala Pro Glu Ile Ser Glu Leu Arg Ala His Leu Lys Ala

305 310 315

Lys Leu Pro Asp Tyr Met Val Pro Ser Ala Phe Val Ile Val Asp

320 325 330

Ser Leu Pro Leu Thr Pro Asn Gly Lys Ile Asp Arg Asn Ala Leu

335 340 345

Pro Pro Phe Asp Arg Asp Thr Val Ile Arg Asp Pro Ile Tyr Val

350 355 360

Ala Pro Gly Asn Ala Arg Glu Lys Ala Ile Ala Asp Ile Trp Ser

365 370 375

Glu Ile Leu Gly Val Lys Arg Ile Gly Val His Asp Asn Phe Phe

380 385 390

Ala Pro Gly Gly Pro Ser

395

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1200

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
AATCTACACG TCCGGCACCA CCGGCAAGCC CAAGGGGGCC ATAATCCATC 50
ACCTGGGACT GGCGAATTAC TTGGTGTGGT GCTCGCGGGC TTACGCGATT 100
GCTCAAGGAG TGGGAGCACC GGTCCACTCG TCGATCTCGT TCGATCTGAC 150
GATCACTGCC TTGCTTGCCC CCTTGGTCGT CGGCCGGCGC ATCGACCTGC 200
TTGATGAAGA ACTGGGCATC GAGCAACTGA GTTACGCTCT CCGGCGATCG 250
CGCGACTATA GCCTGGTCAA GATCACTCCG GCTCACCTGC GCTGGCTCGG 300
CGATGAACTG GGACCCTGCG AGGCCGAAGG TCGTACGCGA GCTTTCATCA 350
TCGGTGGTGA GCAACTGACG GCCGAACACG TCKCATTCTG GAGGCGGCAC 400
GCGCCGGGGA CGAGCCTGAT CAACGAGTAT GGTCCGACCG AGACGGTCGT 450
CGGCTGCTGC GTGTACCGCG TGCCTCCTGA CCAGGAGATT TCGGGGCCCA 500
TCCCGATTGG CCGACCGATC GCCAACACGC GTCTCTACGT CCTCGATCCG 550
GATCTCGCGC TGGTACCCAT CGGCGTTGCA GGCGAGCTGT ACATCGGCGG 600
TGCCGGGGTC GCGCGGGGGT ATCTCAACAG GCCCGGCCTG ACCGCTGAAA 650
GGTTCATCCC CGACCCGTTC GGCAAGAAGC CGGGCGAGCG CCTCTATCGC 700
ACCGGAGACC TCGCCCGATG GCGGTCCGAC GGTAACCTCG AGTATCTCGG 750
CAGGGTCGAT CGCCAGGTTA AAGTCCGCGG GTTTCGGATC GAACCCGGGG 800
AGATCGAACA GGCACTCGCC CGGCACTCCG CGGTACGCGA GTCCGTCGTG 850
GTCGCAAGCG CAGGTGCATC GGACGTGCAA CGCCTCGTCG CCTATCTGGT 900
TCTTGCGGAG GCAGGGCCGG CACCGCCCGA CTCGGAGCTG CGCGAGTTCC 950
TGCGGACGTT ACTCCCCGAG CCGATGATAC CCTCGGCATT CGTTGTGCTG 1000
GAGACGCTCC CACTGACCCA CAACGGGAAG GTGGACCGAG AGGCCCTGCC 1050
GGCCCCTGAG GGTGTGCCCT TCCGTGGGGA TGCTCGTTTC GTTGCTCCCC 1100
GCGGCCCGCT CGAACAGGAG GTGGCATCGA TCTGGGGTGC AGTCCTCGGA 1150
CTGGAGCGTA TCGGCGCCCT TGACAACTTC TTCTTCCCTC GGCGGCCCCT 1200

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 399

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Ala Ile Ile

5 10 15

His His Leu Gly Leu Ala Asn Tyr Leu Val Trp Cys Ser Arg Ala

20 25 30

Tyr Ala Ile Ala Gln Gly Val Gly Ala Pro Val His Ser Ser Ile

35 40 45

Ser Phe Asp Leu Thr Ile Thr Ala Leu Leu Ala Pro Leu Val Val

50 55 60

Gly Arg Arg Ile Asp Leu Leu Asp Glu Glu Leu Gly Ile Glu Gln

65 70 75

Leu Ser Tyr Ala Leu Arg Arg Ser Arg Asp Tyr Ser Leu Val Lys

80 85 90

Ile Thr Pro Ala His Leu Arg Trp Leu Gly Asp Glu Leu Gly Pro

95 100 105

Cys Glu Ala Glu Gly Arg Thr Arg Ala Phe Ile Ile Gly Gly Glu

110 115 120

Gln Leu Thr Ala Glu His Val Xaa Phe Trp Arg Arg His Ala Pro

125 130 135

Gly Thr Ser Leu Ile Asn Glu Tyr Gly Pro Thr Glu Thr Val Val

140 145 150

Gly Cys Cys Val Tyr Arg Val Pro Pro Asp Gln Glu Ile Ser Gly

155 160 165

Pro Ile Pro Ile Gly Arg Pro Ile Ala Asn Thr Arg Leu Tyr Val

170 175 180

Leu Asp Pro Asp Leu Ala Leu Val Pro Ile Gly Val Ala Gly Glu

185 190 195

Leu Tyr Ile Gly Gly Ala Gly Val Ala Arg Gly Tyr Leu Asn Arg

200 205 210

Pro Gly Leu Thr Ala Glu Arg Phe Ile Pro Asp Pro Phe Gly Lys

215 220 225

Lys Pro Gly Glu Arg Leu Tyr Arg Thr Gly Asp Leu Ala Arg Trp

230 235 240

Arg Ser Asp Gly Asn Leu Glu Tyr Leu Gly Arg Val Asp Arg Gln

245 250 255

Val Lys Val Arg Gly Phe Arg Ile Glu Pro Gly Glu Ile Glu Gln

260 265 270

Ala Leu Ala Arg His Ser Ala Val Arg Glu Ser Val Val Val Ala

275 280 285

Ser Ala Gly Ala Ser Asp Val Gln Arg Leu Val Ala Tyr Leu Val

290 295 300

Leu Ala Glu Ala Gly Pro Ala Pro Pro Asp Ser Glu Leu Arg Glu

305 310 315

Phe Leu Arg Thr Leu Leu Pro Glu Pro Met Ile Pro Ser Ala Phe

320 325 330

Val Val Leu Glu Thr Leu Pro Leu Thr His Asn Gly Lys Val Asp

335 340 345

Arg Glu Ala Leu Pro Ala Pro Glu Gly Val Pro Phe Arg Gly Asp

350 355 360

Ala Arg Phe Val Ala Pro Arg Gly Pro Leu Glu Gln Glu Val Ala

365 370 375

Ser Ile Trp Gly Ala Val Leu Gly Leu Glu Arg Ile Gly Ala Leu

380 385 390

Asp Asn Phe Phe Phe Pro Arg Arg Pro

395

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1204

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

AGGGGCCGCC GGGCGAGAAG AAGTTCGCGG TGATGCTCAC CGGCGCGTCG 50 

AGCTTCAACG CCTCCTGCCA GATCTCCGCG AGCTTGCTCT CCGTCTCCGT 100
GCCCGGCGCT ACGTATTGGG CGCCGGCGCT ACGGTCGATG GACGGCAGCG 150
CCTTACGATC GATCTTGCCG TTGGCATTCA GCGGAAAGGC CTCCAGGACG 200
CGCCAGCCGC TGGGAATCAT GTACTCGGGC AGGGCCAGCT TGAGGCGCAT 250
CCGCAGCGCC GAGATGAGCA CCTCTTCGTC CGCGGTCTGG GCCACGACGT 300
AGGCGACGAG GGCCTTGTTC TCCCCCTCTC CCTGCGCCAC GACCAGGGCG 350
TCGTCGACGC CAGCCTCGGT CTTCAGCGCG GTCTCGATCT CGCCGAGCTC 400
GATGCGGAAG CCGCGGATCT TGATCTGGTC GTCGAGGCGG CCGAGGAACT 450
CGAGATCGCC GCTGGCGAGC CGGCGAACGA GGTCGCCGCT GCGATAGAGG 500
CGCCCTTCGC CGAAGGGATT GGCGATGAAC TTCGCCGCCG TCAGCTCCGG 550
CTGGTTGACG TAGCCTCTGG CCACCCCTGC CCCGCCAATG CACAGCTCGC 600 
CGGCCACGCC GACCGGCGCG ATCTCCAGTG CCTCGTTGAG GACATACAGC 650 
TCCGTGTTGT CCATGGCCCT GCCGATGGGC AGGCGCTCCG GCAGGCCGGC 700 
CTGGAGAGCG GCGGTGACGT CGAACATGGC GCAGCCGACC ACGGTCTCCG 750 
TGGGACCGTA GTGGTTGTAG ATCTGGGCGT GGGGGAAGCG CGTTTGCAGC 800 
TCGCGGGCGA GCGAGGCGGG AAACGATTCG CCGCCGATGA CGAAAACGTG 850 
TTGAGATGAA GCCCGGGCCG TGTCTTCCGT CAGCTCCGCG CTGTCGAGCA 900 
GAGCGAGCAT ACCGGTGAGA TGCATCGGCG TCATGCGCAG CAGATAAGCC 950
CGTTCGTCGC CGGCCAACGC TTTCGCGAGC TCGTTCAACT CATCGCCGGG 1000 
CGTGGTCAGC GAGACGCAGC CACCCCGGAG CAAGGGAACA TACAGGCTGG 1050 
GCACGGTGAT GTCGAAGCCG TGGGAGGTGA CGAGGAGGGA GCCGGCCAAC 1100 
CCCTTCGCGT AGTAGCGCTG CGAAGCGAAG GCGCAGTAGT CACTGAGGCC 1150 
GGCGTGTCTG ATCTCCACGC CCTTCGGCTT GCCCGTCGTG CCGGACGTGT 1200
AGAT 1204

(2) INFORMATION FOR SEQ ID NO:88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 401

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Glu Ile

5 10 15

Arg His Ala Gly Leu Ser Asp Tyr Cys Ala Phe Ala Ser Gln Arg

20 25 30

Tyr Tyr Ala Lys Gly Leu Ala Gly Ser Leu Val Val Thr Ser His

35 40 45

Gly Phe Asp Ile Thr Val Pro Ser Leu Tyr Val Pro Leu Leu Arg

50 55 60

Gly Gly Cys Val Ser Leu Thr Thr Pro Gly Asp Glu Leu Asn Glu

65 70 75

Leu Ala Lys Ala Leu Ala Gly Asp Glu Arg Ala Tyr Leu Leu Arg

80 85 90

Met Thr Pro Met His Leu Thr Gly Met Leu Ala Leu Leu Asp Ser

95 100 105

Ala Glu Leu Thr Glu Asp Thr Ala Arg Ala Ser Ser Gln His Val

110 115 120

Phe Val Ile Gly Gly Glu Ser Phe Pro Ala Ser Leu Ala Arg Glu

125 130 135

Leu Gln Thr Arg Phe Pro His Ala Gln Ile Tyr Asn His Tyr Gly

140 145 150

Pro Thr Glu Thr Val Val Gly Cys Ala Met Phe Asp Val Thr Ala

155 160 165

Ala Leu Gln Ala Gly Leu Pro Glu Arg Leu Pro Ile Gly Arg Ala

170 175 180

Met Asp Asn Thr Glu Leu Tyr Val Leu Asn Glu Ala Leu Glu Ile

185 190 195

Ala Pro Val Gly Val Ala Gly Glu Leu Cys Ile Gly Gly Ala Gly

200 205 210

Val Ala Arg Gly Tyr Val Asn Gln Pro Glu Leu Thr Ala Ala Lys

215 220 225

Phe Ile Ala Asn Pro Phe Gly Glu Gly Arg Leu Tyr Arg Ser Gly

230 235 240

Asp Leu Val Arg Arg Leu Ala Ser Gly Asp Leu Glu Phe Leu Gly

245 250 255

Arg Leu Asp Asp Gln Ile Lys Ile Arg Gly Phe Arg Ile Glu Leu

260 265 270

Gly Glu Ile Glu Thr Ala Leu Lys Thr Glu Ala Gly Val Asp Asp

275 280 285

Ala Leu Val Val Ala Gln Gly Glu Gly Glu Asn Lys Ala Leu Val

290 295 300

Ala Tyr Val Val Ala Gln Thr Ala Asp Glu Glu Val Leu Ile Ser

305 310 315

Ala Leu Arg Met Arg Leu Lys Leu Ala Leu Pro Glu Tyr Met Ile

320 325 330

Pro Ser Gly Trp Arg Val Leu Glu Ala Phe Pro Leu Asn Ala Asn

335 340 345

Gly Lys Ile Asp Arg Lys Ala Leu Pro Ser Ile Asp Arg Ser Ala

350 355 360

Gly Ala Gln Tyr Val Ala Pro Gly Thr Glu Thr Glu Ser Lys Leu

365 370 375

Ala Glu Ile Trp Gln Glu Ala Leu Lys Leu Asp Ala Pro Val Ser

380 385 390

Ile Thr Ala Asn Phe Phe Ser Pro Gly Gly Pro

395 400

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1190 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 
ATCTACACCT CGGGCACGAC CGGCAAGCCG AAGGGGATCA TGTATTCGCA 50 
TCGATACCTG TTGCATAATA TGCGCAACTA CGGCGACTTA TTTCAGGTCT 100
CCCCCCACGA TCGCTGGAGT TGGTTGCATT CCTACAGCTA TGCTTCGGCG 150 
AATACTGATA TCCTTTGCCC GCTACTGCAC GGCGCCGCCG TCTGCCCTTG 200 
GAATTTGCAT CGTAATGGCC TATCGGGCTT AGCTCGTTGG CTCGCCGAGT 250
CGCGAATCAC CATTTTGAAC TGGATGCCGA CACCGCTACG CAGTTTGGCA 300
AAGCTCTGGC CGCCAAAGCA CGTGCTTCCC GATCTGCGAC TTACAGTGTT 350
GGGCGGCGAA ACGCTGTTTG CCCAAGACGT TGCTGACTTT CGGCGAATAA 400
TTTCGCTGAA TTGCCTAATC GCCAATCGTC TGGGAACTTC GGAAACTGGA 450 
TTGTTTCGGC TCGCGTTTCT CGACCGAGAG ACTCCCCTTG CTAATGGTTC 500 
CATACAGGCC GGATACGAAG TTCCAGACAA GACCGTCGTC CTGTTCGACG 550 
AATATGGAGT TGAGCTGGCC CCTGGCAACG TCGGTCAGAT TGGCGTGCGC 600 
AGCAGGTACT TGCCGCCTGG ATACTGGCGA CGGCCGGAGT TGACAAGCGA 650 
GCGATTTCTA ACCAGTAAAG GCGATGATGA CGTACGGACC TTCCTCACCG 700
GCGACCTTGG GCGAATGCGG GACGACGGAT GCCTCGAGCA CTGCGGACGG 750 
CTCGACTCCC AAGTGAAGAT CCGTGGTCAC CGCATCGCAA TGGGAGAGAT 800 
CGAATTCTTG CTTCGGACAT GCGACGGAGT CAGCGAAGCA GTTGTCATTG 850 
CCAGGCCACA TTCAGACGGT GAAACCCGTT TGATAGCTTA TTTTGTGCCG 900 
ACCGAGAAAA GCGCTATCGA TGTATCGAGC CTTCGTCGGC ACCTGCTGGG 950 
AAAGCTGCCT GGCCACATGA TCCCCTCGGC GTTTGTGCGG CTCGACGGCG 1000 
TGCCCAAAAA CGCCAACCAA AAAGTAGATT GGGCGGCCTT GCCAGCACCG 1050 
AACTTCCAAA ACCAGGGACA GCAGCACGTA CCGCCACAAA CGCCTTGGCA 1100 
GCGACATCTC GTGGAGTTGT GGCAAAAGTT GTTGAATGTG GAATCGATCG 1150 
GCATCCACGA TGACTTCTTC GCCCTCGGCG GCCCCTCCTT 1190 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90: 

Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Ile Met Tyr

5 10 15

Ser His Arg Tyr Leu Leu His Asn Met Arg Asn Tyr Gly Asp Leu

20 25 30

Phe Gln Val Ser Pro His Asp Arg Trp Ser Trp Leu His Ser Tyr

35 40 45

Ser Tyr Ala Ser Ala Asn Thr Asp Ile Leu Cys Pro Leu Leu His

50 55 60

Gly Ala Ala Val Cys Pro Trp Asn Leu His Arg Asn Gly Leu Ser

65 70 75

Gly Leu Ala Arg Trp Leu Ala Glu Ser Arg Ile Thr Ile Leu Asn

80 85 90

Trp Met Pro Thr Pro Leu Arg Ser Leu Ala Lys Leu Trp Pro Pro

95 100 105

Lys His Val Leu Pro Asp Leu Arg Leu Thr Val Leu Gly Gly Glu

110 115 120

Thr Leu Phe Ala Gln Asp Val Ala Asp Phe Arg Arg Ile Ile Ser

125 130 135

Leu Asn Cys Leu Ile Ala Asn Arg Leu Gly Thr Ser Glu Thr Gly

140 145 150

Leu Phe Arg Leu Ala Phe Leu Asp Arg Glu Thr Pro Leu Ala Asn

155 160 165

Gly Ser Ile Gln Ala Gly Tyr Glu Val Pro Asp Lys Thr Val Val

170 175 180

Leu Phe Asp Glu Tyr Gly Val Glu Leu Ala Pro Gly Asn Val Gly

185 190 195

Gln Ile Gly Val Arg Ser Arg Tyr Leu Pro Pro Gly Tyr Trp Arg

200 205 210

Arg Pro Glu Leu Thr Ser Glu Arg Phe Leu Thr Ser Lys Gly Asp

215 220 225

Asp Asp Val Arg Thr Phe Leu Thr Gly Asp Leu Gly Arg Met Arg

230 235 240

Asp Asp Gly Cys Leu Glu His Cys Gly Arg Leu Asp Ser Gln Val

245 250 255

Lys Ile Arg Gly His Arg Ile Ala Met Gly Glu Ile Glu Phe Leu

260 265 270

Leu Arg Thr Cys Asp Gly Val Ser Glu Ala Val Val Ile Ala Arg

275 280 285

Pro His Ser Asp Gly Glu Thr Arg Leu Ile Ala Tyr Phe Val Pro

290 295 300

Thr Glu Lys Ser Ala Ile Asp Val Ser Ser Leu Arg Arg His Leu

305 310 315

Leu Gly Lys Leu Pro Gly His Met Ile Pro Ser Ala Phe Val Arg

320 325 330

Leu Asp Gly Val Pro Lys Asn Ala Asn Gln Lys Val Asp Trp Ala

335 340 345

Ala Leu Pro Ala Pro Asn Phe Gln Asn Gln Gly Gln Gln His Val

350 355 360

Pro Pro Gln Thr Pro Trp Gln Arg His Leu Val Glu Leu Trp Gln

365 370 375

Lys Leu Leu Asn Val Glu Ser Ile Gly Ile His Asp Asp Phe Phe

380 385 390

Ala Leu Gly Gly Pro Ser

395
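
As a consistency check between the paired listings, the first row of SEQ ID NO: 89 can be translated and compared with the first row of SEQ ID NO: 90. The following is an illustrative sketch only: it assumes the standard genetic code and reading frame 1, and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical helpers, not part of this application.

```python
# Illustrative sketch; assumes the standard genetic code and reading frame 1.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry codon table from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


# First 50 nucleotides of SEQ ID NO: 89 as printed in the listing.
first_row = "ATCTACACCT CGGGCACGAC CGGCAAGCCG AAGGGGATCA TGTATTCGCA"
print(" ".join(translate(first_row)[:15]))
```

Under these assumptions the printed residues agree with residues 1 to 15 of SEQ ID NO: 90.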

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1178 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:
AAGGAGGGGC CGCCCGGCGC GAAGAAGTTC TCGTGTAGCC CGACGCGTTC 50 
CAGCTGCAGC ACGGCGCACC AGATCGCTGC GACCTGCCGC TGGACGTCCG 100 
TCATGATCGC GGTGTCCGCT GCGGCCGCTG CCGCGCGATT CACCTGTGGA 150 
ATGGGCAGGG CCTTGCGGTC GATCTTGTCG TTCGGCGTGA GCGGCAGCGC 200
GGCGAGCGAT ACGATCACCT GTGGCACCAT GTACTCGGGG AGTCTCGCGC 250
GGAGCGCCGT CCGGAGCTCG TCGAGCGGCA GCACGCCGTC TTCTGCCGGG 300
ACGACGTACG CCACCAGACG CTGATCGCCG GGGGTGTCCT CGCGCACGAC 350
GGCCACGCTG CGGCGCACCG ACGGATGCTC GGACAGGACC GATTCGATCT 400 
CCCCCAGCTC GATCCGGTAG CCGCGAAGCT TCACCTGATG ATCTCGGCGT 450 
CCGACGAACT CGAGGGCCCG ATCGGCGCGC AGTCGTACGA TGTCGCCGGT 500 
GCGGTACACG CGCTCCGCCG GTCTGCCCGC GACCTCGACG ACGACGAACT 550 
TTTCTGCCGT GAGCTCGGGT CGATGACGAT AGCCCCGCGC CACGCCCTCT 600 
CCTCCGATGC ACAGCTCACC CGGCACGCCG ATGGGAGCCT GGCGACCCGC 650 
GGCGTCGAGC ACGTAGACGT TCGTGTTGGC GATGGGATGG CCGATCGGAA 700 
TATCGCGATC GCAATCCGTG ACCTGATGCA CGGTCGACCA GATCGTCGTC 750 
TCGGTCGGGC CGTACATGTT CCACAGCGCC CGCACCCTCG ACGAGAGATC 800 
GCGCGCGAGA TCGCGTGGAA GGGCCTCCCC GCCGCAGAGC GCGGTGAGAT 850 
CCGTCTTGCC CTGCCAGCCG GCGTCGATGA GCAGGCGCCA GGTCGCGGGG 900 
GTCGCCTGCA TCATCGTCGC TCTGCACGAT TCGATGCGCT CGCGAAGACG 950 
CTCGCCGTCG AGCACGTCGC CGCGGGAGGC GATGACCGTC CTCCCGCCGA 1000 
CGACGAGAGG CAAGAACAGC TCGAGACCCG CGATGTCGAA CGACGGCGTG 1050 
GTGACCGCGA GGAGCACGTC GCCGGCTCGC AAGCCTGGCT CCTTCTGCAT 1100
GGCGCGCAGG AAATTCACGA GCTGGCGGTG CTCGATCTCG ACCCCCTTCG 1150 




GCTTGCCCGT CGTGCCCGAC GTGTAGAT 1178 



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Glu Ile

5 10 15

Glu His Arg Gln Leu Val Asn Phe Leu Arg Ala Met Gln Lys Glu

20 25 30

Pro Gly Leu Arg Ala Gly Asp Val Leu Leu Ala Val Thr Thr Pro

35 40 45

Ser Phe Asp Ile Ala Gly Leu Glu Leu Phe Leu Pro Leu Val Val

50 55 60

Gly Gly Arg Thr Val Ile Ala Ser Arg Gly Asp Val Leu Asp Gly

65 70 75

Glu Arg Leu Arg Glu Arg Ile Glu Ser Cys Arg Ala Thr Met Met

80 85 90

Gln Ala Thr Pro Ala Thr Trp Arg Leu Leu Ile Asp Ala Gly Trp

95 100 105

Gln Gly Lys Thr Asp Leu Thr Ala Leu Cys Gly Gly Glu Ala Leu

110 115 120

Pro Arg Asp Leu Ala Arg Asp Leu Ser Ser Arg Val Arg Ala Leu

125 130 135

Trp Asn Met Tyr Gly Pro Thr Glu Thr Thr Ile Trp Ser Thr Val

140 145 150

His Gln Val Thr Asp Cys Asp Arg Asp Ile Pro Ile Gly His Pro

155 160 165

Ile Ala Asn Thr Asn Val Tyr Val Leu Asp Ala Ala Gly Arg Gln

170 175 180

Ala Pro Ile Gly Val Pro Gly Glu Leu Cys Ile Gly Gly Glu Gly

185 190 195

Val Ala Arg Gly Tyr Arg His Arg Pro Glu Leu Thr Ala Glu Lys

200 205 210

Phe Val Val Val Glu Val Ala Gly Arg Pro Ala Glu Arg Val Tyr

215 220 225

Arg Thr Gly Asp Ile Val Arg Leu Arg Ala Asp Arg Ala Leu Glu

230 235 240

Phe Val Gly Arg Arg Asp His Gln Val Lys Leu Arg Gly Tyr Arg

245 250 255

Ile Glu Leu Gly Glu Ile Glu Ser Val Leu Ser Glu His Pro Ser

260 265 270

Val Arg Arg Ser Val Ala Val Val Arg Glu Asp Thr Pro Gly Asp

275 280 285

Gln Arg Leu Val Ala Tyr Val Val Pro Ala Glu Asp Gly Val Leu

290 295 300

Pro Leu Asp Glu Leu Arg Thr Ala Leu Arg Ala Arg Leu Pro Glu

305 310 315

Tyr Met Val Pro Gln Val Ile Val Ser Leu Ala Ala Leu Pro Leu

320 325 330

Thr Pro Asn Asp Lys Ile Asp Arg Lys Ala Leu Pro Ile Pro Gln

335 340 345

Val Asn Arg Ala Ala Ala Ala Ala Ala Asp Thr Ala Ile Met Thr

350 355 360

Asp Val Gln Arg Gln Val Ala Ala Ile Trp Cys Ala Val Leu Gln

365 370 375

Leu Glu Arg Val Gly Leu His Glu Asn Phe Phe Ala Pro Gly Gly

380 385 390

Pro Ser
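
The nucleotide listing SEQ ID NO: 91 appears to be printed on the strand complementary to the coding sense of SEQ ID NO: 92: the reverse complement of its final row (positions 1151 to 1178) translates to the opening residues of the protein. The sketch below illustrates that check. It assumes the standard genetic code, and the function names `reverse_complement` and `translate_one_letter` are hypothetical helpers introduced here for illustration.

```python
# Illustrative sketch; assumes the standard genetic code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (spaces allowed)."""
    seq = "".join(dna.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


def translate_one_letter(seq):
    """Translate frame 1 of an unspaced DNA string into one-letter codes."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Final row of SEQ ID NO: 91 as printed in the listing.
last_row = "GCTTGCCCGT CGTGCCCGAC GTGTAGAT"
sense = reverse_complement(last_row)
print(sense)
print(translate_one_letter(sense))
```

Under these assumptions the translation reads IYTSGTTGK, that is Ile Tyr Thr Ser Gly Thr Thr Gly Lys, which agrees with residues 1 to 9 of SEQ ID NO: 92.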



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1178 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 
ATCTACACCT CCGGCACGAC GGGCAAGCCG AAGGGAGTAA AGATCACACA 50 
TCGTGCCGTG GTGAATTTTC TGAACTCGAT GCGGCGTGAA CCAGGGCTGA 100 
CCCCGGACGA TGTGGTGCTC TCGGTCACCA CGCTGTCGTT TGACATTGCC 150 
GGACTCGAAC TCCACCTGCC CCTGACGACT GGAGCCACGG TCGTAGTGGC 200 
GACCCAAGAC GCGGTGTCCG ACGCTGAACT GCTGGTCAGA GAGTTGGAGC 250
GGACCGGAAC AACTCTGTTG CAGGCGACGC CAGTCACATG GCGAATGCTT 300
CTGGAGTCGG GCTGGAAAGG AAATCCGCGA CTCAAGGCTC TGGTCGGAGG 350
TGAGGCAGTG CCGAGGGACC TGGTGAATCG GCTTGCTCCC CTTTGCGCGT 400
CACTTTGGAA CATGTACGGA CCAACGGAAA CCACGATCTG GTCAACGGTT 450
GGGCGTCTGG AGGCTGGAGA TGGTGTGTCT AGTATTGGCC GGCCCATCGA 500
CAATACGCGG ATTTACGTCG TGGATCCGTC GATACACCTT CAGCCCATCG 550
GAGTTCCCGG CGAATTGCTG ATTGGCGGAG AAGGATTGGC CGACGGATAT 600
CTGAAACGCG ATCAGTTGAC GGCAGAGAAG TTCATTCCTG ATCCATTTGG 650
TGGGAGGCCT GGGTCTCGGC TGTATCGAAC CGGAGATCTT GCGCGCTGGC 700
GCGCGGACGG CACCTTGGAG TGTCTCGGAC GAATGGACCA ACAGGTGAAG 750
ATTCGGGGTT CCCGGATCGA ATTGGGTGAG ATCGAAACCC TGTTGGCCTC 800
CCACCCGGAT GTGAAACAGA ACGTGGTGGT CGTACGCGAG GACAGCCCCG 850
GGGAAAAAAA ATTGGTGGGC TATTTCGTGC CGGCGAACGG ACGCAATCCC 900
GAAGTGATGG AATTTCGCAA ACATCTGCAG CGGACGCTTC CGGATTACAT 950
GGTCCCCTCA GTGTACGTGC CCTTGACCTC GGTTCCGCTT ACACCCAACG 1000
GAAAGATCGA CCGCAAGGCG CTGCCCGCAC CGGATATCAG CGCCGTCACG 1050
GTTTCCCGAG AGTCAATTGC GCCGCGCAAT CCCGCCGAAG AGCGGCTGGC 1100
AGCAATTTTC GCCAAGGTGC TTGGCACGCC GATCGCCTCG ATCCACGACA 1150
GCTTCTTCTC CCCGGGCGGC CCCTCCAT 1178



(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: 

(A) DESCRIPTION: protein 

(iii) HYPOTHETICAL: no 

(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 
Ile Tyr Thr Ser Gly Thr Thr Gly Lys Pro Lys Gly Val Lys Ile

5 10 15

Thr His Arg Ala Val Val Asn Phe Leu Asn Ser Met Arg Arg Glu

20 25 30

Pro Gly Leu Thr Pro Asp Asp Val Val Leu Ser Val Thr Thr Leu

35 40 45

Ser Phe Asp Ile Ala Gly Leu Glu Leu His Leu Pro Leu Thr Thr

50 55 60

Gly Ala Thr Val Val Val Ala Thr Gln Asp Ala Val Ser Asp Ala

65 70 75

Glu Leu Leu Val Arg Glu Leu Glu Arg Thr Gly Thr Thr Leu Leu

80 85 90

Gln Ala Thr Pro Val Thr Trp Arg Met Leu Leu Glu Ser Gly Trp

95 100 105

Lys Gly Asn Pro Arg Leu Lys Ala Leu Val Gly Gly Glu Ala Val

110 115 120


Pro Arg Asp Leu Val Asn Arg Leu Ala Pro Leu Cys Ala Ser Leu

125 130 135

Trp Asn Met Tyr Gly Pro Thr Glu Thr Thr Ile Trp Ser Thr Val

140 145 150

Gly Arg Leu Glu Ala Gly Asp Gly Val Ser Ser Ile Gly Arg Pro

155 160 165

Ile Asp Asn Thr Arg Ile Tyr Val Val Asp Pro Ser Ile His Leu

170 175 180

Gln Pro Ile Gly Val Pro Gly Glu Leu Leu Ile Gly Gly Glu Gly

185 190 195

Leu Ala Asp Gly Tyr Leu Lys Arg Asp Gln Leu Thr Ala Glu Lys

200 205 210

Phe Ile Pro Asp Pro Phe Gly Gly Arg Pro Gly Ser Arg Leu Tyr

215 220 225

Arg Thr Gly Asp Leu Ala Arg Trp Arg Ala Asp Gly Thr Leu Glu

230 235 240

Cys Leu Gly Arg Met Asp Gln Gln Val Lys Ile Arg Gly Ser Arg

245 250 255

Ile Glu Leu Gly Glu Ile Glu Thr Leu Leu Ala Ser His Pro Asp

260 265 270

Val Lys Gln Asn Val Val Val Val Arg Glu Asp Ser Pro Gly Glu

275 280 285

Lys Lys Leu Val Gly Tyr Phe Val Pro Ala Asn Gly Arg Asn Pro

290 295 300

Glu Val Met Glu Phe Arg Lys His Leu Gln Arg Thr Leu Pro Asp

305 310 315

Tyr Met Val Pro Ser Val Tyr Val Pro Leu Thr Ser Val Pro Leu

320 325 330

Thr Pro Asn Gly Lys Ile Asp Arg Lys Ala Leu Pro Ala Pro Asp

335 340 345

Ile Ser Ala Val Thr Val Ser Arg Glu Ser Ile Ala Pro Arg Asn

350 355 360

Pro Ala Glu Glu Arg Leu Ala Ala Ile Phe Ala Lys Val Leu Gly

365 370 375

Thr Pro Ile Ala Ser Ile His Asp Ser Phe Phe Ser Pro Gly Gly

380 385 390

Pro

1. A method for recovery of antibiotic biosynthetic DNA from humic materials or lichen comprising the steps of:

(a) combining a humic or lichen-derived sample with a set of amplification primers under conditions suitable for polymerase chain reaction amplification, wherein the primer set is a degenerate primer set selected to hybridize with conserved regions of antibiotic biosynthetic gene;

(b) cycling the combined sample through a plurality of amplification cycles to amplify DNA complementary to the primer set; and

(c) isolating the amplified DNA.

2. The method according to claim 1, wherein the primer set hybridizes with a polyketide synthase gene.

3. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 1 and 2.

4. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 3 and 4.

5. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 5 and 6.

6. The method according to claim 2, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 11 and 12.

7. The method according to claim 1, wherein the primer set hybridizes with an isopenicillin N synthase gene.

8. The method according to claim 7, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 7 and 8.

9. The method according to claim 1, wherein the primer set hybridizes with a peptide synthetase gene.

10. The method according to claim 9, wherein the primer set comprises primers having the sequence set forth in SEQ ID Nos. 9 and 10.

11. The method according to any of claims 1 to 10, wherein the sample comprises DNA extracted from a soil sample.

12. The method according to claim 1, wherein the sample is a lichen-derived sample.

13. The method according to any of claims 1 to 12, further comprising the steps of cloning the isolated DNA into a host organism, and isolating the cloned DNA.

14. The method according to claim 13, wherein the host organism is E. coli.

15. An oligonucleotide primer having the sequence as defined in any of Seq. ID. Nos. 1 through 8.

16. A composition comprising two oligonucleotide primers having the sequence as defined in Seq. ID Nos. 1 and 2; 3 and 4; 5 and 6; or 7 and 8.

17. A polynucleotide comprising a region having the sequence given by any of sequence ID Nos. 13, 15, 17, 19, 21, 23, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91 or 93.

18. A biosynthetic polypeptide encoded by a polynucleotide comprising a region having the sequence given by any of sequence ID Nos. 13, 15, 17, 19, 21, 23, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91 or 93.

19. The biosynthetic polypeptide of claim 18, wherein the polypeptide has the amino acid sequence given by any of Sequence ID Nos. 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92 or 94.



